Assembly Wrapping

Hidden Within Plain Sight

Nov 05, 2022

Hello all and welcome back. I was busy wasting time at work, researching interesting topics I could write about, when I found this article. The last section in particular blew my mind, and I immediately wanted to write about it.

Let’s go over the concept. It has to do with assembly, so make sure you’ve brushed up on your x64, rather than the x86 we commonly use.

Look at this assembly code with me for a bit. What does it look like to you?

do_some_math PROC
push   rbp
push   rdi
movabs rax,0xeecc04eb28ec8348
movabs rbx,0x5eb00007fffbe41
add    rax,rbx
movabs rbx,0x112207eb20e6c149
or     rax,rbx
movabs rbx,0x14eb419e1780bf41
xor    rax,rbx
mov    rbx,QWORD PTR [rsp]
add    rax,rbx
mov    rbx,QWORD PTR [rsp+0x4]
add    rax,rbx
movabs rbx,0x5ebd6ff41fe014d
sub    rax,rbx
movabs rbx,0x778877c328c48348
imul   rax,rbx
shr    rax,0x4
pop    rdi
pop    rbp
do_some_math ENDP

You don’t need to be an ASM pro to figure out what this function is doing. Slow down and look at the key instructions. It’s taking very large numbers and performing some serious arithmetic on them. Looks harmless, right?

Well, what if I told you there is another function inside of this assembly code?

What? Where?

Yeah. That was my reaction too.

Under the right circumstances, if you point the instruction pointer at the right spot inside this function, something else entirely will happen, and no, it won’t crash.

Now, I’m not saying you’re just patching over this function with another function that you want. This is something different entirely.

Let’s zoom out a little bit and look at the opcodes that would be generated by this code.

55                      push   rbp
57                      push   rdi
48 b8 48 83 ec 28 eb    movabs rax,0xeecc04eb28ec8348
04 cc ee
48 bb 41 be ff 7f 00    movabs rbx,0x5eb00007fffbe41
00 eb 05
48 03 c3                add    rax,rbx
48 bb 49 c1 e6 20 eb    movabs rbx,0x112207eb20e6c149
07 22 11
48 0b c3                or     rax,rbx
48 bb 41 bf 80 17 9e    movabs rbx,0x14eb419e1780bf41
41 eb 14
48 33 c3                xor    rax,rbx
48 8b 1c 24             mov    rbx,QWORD PTR [rsp]
48 03 c3                add    rax,rbx
48 8b 5c 24 04          mov    rbx,QWORD PTR [rsp+0x4]
48 03 c3                add    rax,rbx
48 bb 4d 01 fe 41 ff    movabs rbx,0x5ebd6ff41fe014d
d6 eb 05
48 2b c3                sub    rax,rbx
48 bb 48 83 c4 28 c3    movabs rbx,0x778877c328c48348
77 88 77
48 0f af c3             imul   rax,rbx
48 c1 e8 04             shr    rax,0x4
5f                      pop    rdi
5d                      pop    rbp
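
If it isn’t obvious where the bytes of each immediate come from: x64 stores immediates little-endian, so the low byte of the constant comes first in memory. A quick Python check, where the helper name imm_bytes is made up for illustration:

```python
def imm_bytes(value: int) -> str:
    """Return the 8 bytes of a 64-bit immediate in memory order (little-endian)."""
    return value.to_bytes(8, "little").hex(" ")


# The six movabs constants from do_some_math, in order of appearance.
IMMEDIATES = [
    0xEECC04EB28EC8348,
    0x05EB00007FFFBE41,
    0x112207EB20E6C149,
    0x14EB419E1780BF41,
    0x05EBD6FF41FE014D,
    0x778877C328C48348,
]

for imm in IMMEDIATES:
    print(f"{imm:#018x} -> {imm_bytes(imm)}")
```

Each printed byte string should match the eight bytes that follow the 48 b8 / 48 bb opcode in the listing.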

Now, what if I told you that the opcode for a short relative jump is EB? Look a little closer. I’ve put brackets around each one, together with the displacement byte that follows it.

55                                  push   rbp
57                                  push   rdi
48 b8 48 83 ec 28 [eb 04] cc ee     movabs rax,0xeecc04eb28ec8348
48 bb 41 be ff 7f 00 00 [eb 05]     movabs rbx,0x5eb00007fffbe41
48 03 c3                            add    rax,rbx
48 bb 49 c1 e6 20 [eb 07] 22 11     movabs rbx,0x112207eb20e6c149
48 0b c3                            or     rax,rbx
48 bb 41 bf 80 17 9e 41 [eb 14]     movabs rbx,0x14eb419e1780bf41
48 33 c3                            xor    rax,rbx
48 8b 1c 24                         mov    rbx,QWORD PTR [rsp]
48 03 c3                            add    rax,rbx
48 8b 5c 24 04                      mov    rbx,QWORD PTR [rsp+0x4]
48 03 c3                            add    rax,rbx
48 bb 4d 01 fe 41 ff d6 [eb 05]     movabs rbx,0x5ebd6ff41fe014d
48 2b c3                            sub    rax,rbx
48 bb 48 83 c4 28 c3 77 88 77       movabs rbx,0x778877c328c48348
48 0f af c3                         imul   rax,rbx
48 c1 e8 04                         shr    rax,0x4
5f                                  pop    rdi
5d                                  pop    rbp

Do you see them now? The byte after EB is a signed 8-bit displacement, counted from the end of the 2-byte jmp, so EB 04 skips the next 4 bytes. Each jump hops over the filler bytes, the decoy arithmetic, and the 48 BB opcode of the next movabs, landing at the start of the next immediate. So these can be followed to find the next instruction that would be executed. Let’s look one more time, this time with every byte that actually executes in brackets.

55                                  push   rbp
57                                  push   rdi
48 b8 [48 83 ec 28 eb 04] cc ee     movabs rax,0xeecc04eb28ec8348
48 bb [41 be ff 7f 00 00 eb 05]     movabs rbx,0x5eb00007fffbe41
48 03 c3                            add    rax,rbx
48 bb [49 c1 e6 20 eb 07] 22 11     movabs rbx,0x112207eb20e6c149
48 0b c3                            or     rax,rbx
48 bb [41 bf 80 17 9e 41 eb 14]     movabs rbx,0x14eb419e1780bf41
48 33 c3                            xor    rax,rbx
48 8b 1c 24                         mov    rbx,QWORD PTR [rsp]
48 03 c3                            add    rax,rbx
48 8b 5c 24 04                      mov    rbx,QWORD PTR [rsp+0x4]
48 03 c3                            add    rax,rbx
48 bb [4d 01 fe 41 ff d6 eb 05]     movabs rbx,0x5ebd6ff41fe014d
48 2b c3                            sub    rax,rbx
48 bb [48 83 c4 28 c3] 77 88 77     movabs rbx,0x778877c328c48348
48 0f af c3                         imul   rax,rbx
48 c1 e8 04                         shr    rax,0x4
5f                                  pop    rdi
5d                                  pop    rbp
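
To double-check where each jump lands, here is the arithmetic as a small Python sketch. The offsets are counted by hand from the listing, with push rbp at offset 0, and the name short_jmp_target is made up for illustration:

```python
def short_jmp_target(jmp_offset: int, disp: int) -> int:
    """Offset that a 2-byte `EB disp` jump lands on, given where the EB byte sits."""
    if disp >= 0x80:              # the displacement is a signed 8-bit value
        disp -= 0x100
    return jmp_offset + 2 + disp  # measured from the byte right after the jmp


# (offset of the EB byte, displacement byte) for the five jumps in do_some_math
JUMPS = [(8, 0x04), (20, 0x05), (31, 0x07), (46, 0x14), (74, 0x05)]

for off, disp in JUMPS:
    print(f"jmp at offset {off} lands at offset {short_jmp_target(off, disp)}")
```

Every landing offset is the first byte of the next movabs immediate. The sign handling matters in general: a displacement of 0xFE means -2, which would make the jump land on itself.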

Look at all of those bytes. Now let’s extract them and disassemble them. An online disassembler works fine for this.

C3 is the opcode for ret, which ends a subroutine.
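
If you’d rather script the extraction than copy bytes by hand, here is a rough Python sketch. It starts at offset 4 (two pushes, then the 48 B8 opcode) and follows each EB until it hits a C3. It is deliberately naive: it assumes the first EB or C3 byte it sees is an opcode, which happens to hold for this listing, and a real tool would need a proper instruction decoder. The names CODE and extract_hidden are made up for illustration.

```python
# Bytes of do_some_math, copied from the opcode listing.
CODE = bytes.fromhex("""
    55 57
    48 b8 48 83 ec 28 eb 04 cc ee
    48 bb 41 be ff 7f 00 00 eb 05
    48 03 c3
    48 bb 49 c1 e6 20 eb 07 22 11
    48 0b c3
    48 bb 41 bf 80 17 9e 41 eb 14
    48 33 c3
    48 8b 1c 24
    48 03 c3
    48 8b 5c 24 04
    48 03 c3
    48 bb 4d 01 fe 41 ff d6 eb 05
    48 2b c3
    48 bb 48 83 c4 28 c3 77 88 77
    48 0f af c3
    48 c1 e8 04
    5f 5d
""")


def extract_hidden(code: bytes, start: int) -> bytes:
    """Follow EB jumps from `start` until a C3, collecting the bytes that execute."""
    out = bytearray()
    pos = start
    while True:
        end = pos
        while code[end] not in (0xEB, 0xC3):  # naive scan, see the caveat above
            end += 1
        if code[end] == 0xC3:                 # ret: the hidden function ends here
            out += code[pos:end + 1]
            return bytes(out)
        disp = code[end + 1]
        if disp >= 0x80:                      # signed 8-bit displacement
            disp -= 0x100
        out += code[pos:end + 2]              # keep the jmp so the listing shows it
        pos = end + 2 + disp                  # continue where the jmp lands


hidden = extract_hidden(CODE, start=4)
print(hidden.hex(" "))
```

Paste the printed bytes into a disassembler and you should get the listing that follows.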

sub    rsp,0x28
jmp    0xa
mov    r14d,0x7fff
jmp    0x13
shl    r14,0x20
jmp    0x1b
mov    r15d,0x419e1780
jmp    0x30
add    r14,r15
call   r14
jmp    0x29
add    rsp,0x28
ret

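The jmp targets shown are offsets into the extracted bytes, not into do_some_math, so ignore the numbers. The jumps only exist to hop from one immediate to the next, so let’s get rid of those jmp instructions.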

sub    rsp,0x28
mov    r14d,0x7fff
shl    r14,0x20
mov    r15d,0x419e1780
add    r14,r15
call   r14
add    rsp,0x28
ret

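This looks very interesting. What’s happening here is that an address is built in a register (R14) and then called. Let’s look at how this is done.

Firstly, the stack frame is prepared. On Windows x64, 0x28 bytes covers the 32 bytes of shadow space the callee expects plus 8 bytes to keep the stack 16-byte aligned for the call.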

sub    rsp,0x28

Then, R14 is set to 0x7FFF. Writing to the 32-bit register R14D also zeroes the upper 32 bits of R14.

mov    r14d,0x7fff

R14 is shifted left by 32 (0x20) bits. This sets its value to 0x7FFF00000000.

shl    r14,0x20

R15 is set to 0x419E1780, again through its 32-bit half, R15D.

mov    r15d,0x419e1780

R14 and R15 are added together, leaving 0x7FFF419E1780 in R14.

add    r14,r15
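
The register arithmetic is easy to sanity-check. A minimal Python sketch that mirrors the four instructions, with a 64-bit mask standing in for the register width:

```python
MASK64 = (1 << 64) - 1             # registers are 64 bits wide

r14 = 0x7FFF                       # mov r14d,0x7fff (upper 32 bits zeroed)
r14 = (r14 << 0x20) & MASK64       # shl r14,0x20
r15 = 0x419E1780                   # mov r15d,0x419e1780
r14 = (r14 + r15) & MASK64         # add r14,r15

print(hex(r14))                    # prints 0x7fff419e1780
```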

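Now that I’m writing this, I wondered whether a single mov r14, value would have done the job. It wouldn’t fit: a mov with a 64-bit immediate is 10 bytes long, and each 8-byte chunk only has room for 6 bytes of payload ahead of its 2-byte jmp. So the address has to be built in halves.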

Lastly, the address that R14 contains is called.

call   r14

And cleanup.

add    rsp,0x28
ret

All of this hidden within another function. Not very easy for disassemblers to catch something like that, hm?

I wrote a program and an accompanying DLL that hide and call a wrapped function. It’s on my Tor site under programs, called “asmwrap”. Let’s look at the DLL in IDA for a little bit.

There’s a function called func_with_no_xrefs. It, of course, has no xrefs.

I had to compile the DLL under Visual Studio’s Debug configuration since a function with no xrefs would be optimized out.

There’s also a function called do_some_math. Look familiar?

This looks completely harmless. However, the large 8-byte numbers are going to be replaced with the custom call function. test.cpp does this. It’s pretty ugly, so I won’t paste it here. It loads the asmwrap DLL, patches over the instructions with the wrapped code, and calls it. It even invokes the math function before and after patching over it.
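
To give a feel for how the replacement numbers are put together, here is a simplified Python sketch. It is not the code from test.cpp and it never touches process memory; it only shows how a few hidden instruction bytes, a short jmp, and some filler pack into one 64-bit immediate. The name pack_chunk, its parameters, and the slot offsets (counted by hand from the opcode listing) are assumptions made for illustration.

```python
from typing import Optional


def pack_chunk(payload: bytes, slot: int, next_slot: Optional[int] = None,
               pad: bytes = b"") -> int:
    """Pack hidden instruction bytes into one 64-bit movabs immediate.

    slot and next_slot are the offsets of this immediate and the next one inside
    the host function. The last chunk (next_slot=None) must already end in ret.
    """
    chunk = bytearray(payload)
    if next_slot is not None:
        disp = next_slot - (slot + len(chunk) + 2)  # rel8 counts from after the jmp
        if not 0 <= disp <= 0x7F:
            raise ValueError("next slot is out of short-jump range")
        chunk += bytes([0xEB, disp])
    if len(chunk) > 8:
        raise ValueError("payload does not fit in an 8-byte immediate")
    chunk += pad[: 8 - len(chunk)]                  # caller-chosen filler bytes
    chunk += b"\xcc" * (8 - len(chunk))             # int3 for anything still empty
    return int.from_bytes(chunk, "little")


# sub rsp,0x28 in the first immediate (offset 4), jumping to the second (offset 14)
first = pack_chunk(bytes.fromhex("4883ec28"), slot=4, next_slot=14,
                   pad=bytes.fromhex("ccee"))
# add rsp,0x28 / ret in the last immediate (offset 81)
last = pack_chunk(bytes.fromhex("4883c428c3"), slot=81,
                  pad=bytes.fromhex("778877"))
print(hex(first), hex(last))
```

With the same filler bytes as the listing, the two results come out as the first and last constants in do_some_math. The 6-byte limit on payload is the main design constraint: anything longer has to be split across chunks.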

That’s all I wanted to share for this post. It’s pretty technical. I usually leave that for my paid posts, but this was so cool, I wanted to share it with everyone.

This is such a cool concept, and it’s scalable too! Imagine hiding an encryption key or some malicious shellcode inside of a function that’s actually useful. Very, very cool.

Have a great weekend!

Go!

-BowTiedCrawfish

Shellfish Systems and Security
