Recording audio of a single process: Part II

Today we will focus on mapping the functions we found. They all reside in AudioSes.dll, so this will be our primary target.

Interfaces and IAT

The problem with the IAudioClient and IAudioRenderClient interfaces is that, as with all other interfaces, their functions are not exported by the DLL but are called through their VTables. So you don’t get their functions mapped nicely into your IAT and have to dig a little deeper. If you are unfamiliar with VTables and/or the IAT, I suggest you at least look up the virtual function table concept before continuing to read. I will briefly explain it, though.

Virtual function tables in MSVC

In MSVC, classes with virtual functions have their VTable pointer placed at +0x0 in memory, so the very first field contains a pointer to the table where the addresses of all virtual functions are stored. You can imagine the table as an array of function pointers, each 4 bytes in size on x86. During compilation, the compiler replaces a call to object->SomeInterfaceMethod with an indexed call through the virtual function table, which could look like this: object->VFT[index].
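To make the layout concrete, here is a minimal sketch that models it by hand instead of relying on the compiler. FakeClient, Method, kFakeVft and callByIndex are hypothetical names made up for this illustration; they are not part of any Windows API.

```cpp
// Hypothetical stand-in for an object with virtual functions: the first
// field is the pointer to the table of function pointers (what sits at +0x0).
struct FakeClient;
using Method = int (*)(FakeClient* self);

struct FakeClient {
    const Method* vft;   // +0x0: pointer to the virtual function table
    int padding;         // ordinary data member that follows the pointer
};

static int addRef(FakeClient*)          { return 1; }
static int getPadding(FakeClient* self) { return self->padding; }

// The table itself: an array of function pointers, one slot per method.
static const Method kFakeVft[] = { addRef, getPadding };

// Roughly what a call to object->SomeInterfaceMethod boils down to:
// load the table from the object, index it, pass the object as 'this'.
int callByIndex(FakeClient* object, int index) {
    return object->vft[index](object);
}

// Builds an object whose first field points at the shared table.
FakeClient makeClient(int padding) {
    return FakeClient{ kFakeVft, padding };
}
```

With this model, callByIndex(&client, 1) ends up in getPadding, in the same way object->VFT[index] selects an interface method.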

Mapping addresses from VFT to DLL

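In order to find the implementation of the interface functions in our audio DLL, you can check the declaration of the interfaces in Audioclient.h: the order in which the methods are declared, counted after the three methods inherited from IUnknown, gives their VTable index. You could also simply call the functions and check the asm output, which will call the function roughly like this (argument pushes omitted):

mov    eax, pAudioClient      ; eax = pointer to the object
mov    ecx, [eax]             ; ecx = VTable pointer stored at +0x0
mov    eax, [ecx+index*4]     ; eax = address stored in the VTable slot
call   eax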

You can then retrieve the actual location using something like this:

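// 32-bit only: a DWORD is 4 bytes wide, the size of a pointer on x86.
DWORD vTable = *(DWORD*)pAudioClient;                      // address of the VTable (stored at +0x0)
DWORD getCurrentPaddingOffset = *(DWORD*)(vTable + 0x18);  // slot 6 (0x18 / 4): absolute address of GetCurrentPadding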
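The DWORD version above only works in a 32-bit process. As a sketch of the same lookup that does not depend on the pointer size, the slot can be addressed by index instead of by byte offset. FakeObject, vtableOf, methodAddress and byteOffsetOfSlot are hypothetical helper names, and FakeObject only stands in for a real interface pointer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the object behind an interface pointer: only the
// first field matters here, the pointer to the table of function addresses.
struct FakeObject {
    std::uintptr_t* vft;
};

// Read the VTable pointer stored in the first pointer-sized field.
std::uintptr_t* vtableOf(void* object) {
    return *static_cast<std::uintptr_t**>(object);
}

// Return the function address stored in the given VTable slot.
std::uintptr_t methodAddress(void* object, std::size_t index) {
    return vtableOf(object)[index];
}

// Byte offset of a slot: index * 4 on x86, index * 8 on x64.
constexpr std::size_t byteOffsetOfSlot(std::size_t index) {
    return index * sizeof(void*);
}
```

On x86, byteOffsetOfSlot(6) is the 0x18 used above, so methodAddress(pAudioClient, 6) would read the same slot as the DWORD code.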

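Calculating the offset into the DLL is then easy: subtract the AudioSes.dll base address from the address we just read (getCurrentPaddingOffset - AudioSes base). On Windows 8 (32-bit), GetCurrentPadding would be at 10018D7D (its VTable byte offset is 0x18, i.e. index 6).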
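The subtraction can be written out as a small sketch. The base address 0x10000000 in the usage note is only an assumption for illustration; the real base has to be queried at runtime, for example with GetModuleHandle on the loaded DLL. offsetInModule and addressFromOffset are hypothetical helper names.

```cpp
#include <cstdint>

// Offset of an address relative to the base of the module that contains it.
constexpr std::uintptr_t offsetInModule(std::uintptr_t address,
                                        std::uintptr_t moduleBase) {
    return address - moduleBase;
}

// Inverse step: rebuild the absolute address when the module is loaded
// at a different base.
constexpr std::uintptr_t addressFromOffset(std::uintptr_t moduleBase,
                                           std::uintptr_t offset) {
    return moduleBase + offset;
}
```

If the DLL were loaded at 0x10000000, offsetInModule(0x10018D7D, 0x10000000) would give 0x18D7D, and that offset could be added to whatever base the DLL gets in another process running the same DLL version.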

So now we know where the interface functions are stored in the DLL.

Hooking the functions

Hooking the functions is no different from normal hooking. We could either use the static offsets we gathered or resolve the addresses from the VTable at runtime and use them. The latter is version independent, so it’s preferable.
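One possible way to use the dynamically resolved slot is to replace the VTable entry itself. This is only one hooking technique among several, not necessarily the one used in this series. The sketch below simulates it on a hand-built, writable table; FakeClient, hookSlot and installHook are hypothetical names. A real VTable lives in read-only memory of the DLL, so the page would first have to be made writable (for example with VirtualProtect).

```cpp
#include <cstddef>

struct FakeClient;
using Method = int (*)(FakeClient* self);

struct FakeClient {
    Method* vft;   // writable table in this simulation only
    int padding;
};

static int realGetPadding(FakeClient* self) { return self->padding; }

static Method g_table[] = { realGetPadding };  // simulated VTable
static Method g_original = nullptr;            // saved original entry
static int g_calls = 0;                        // how often the hook ran

// The hook: observe the call, then forward it to the original function.
static int hookedGetPadding(FakeClient* self) {
    ++g_calls;
    return g_original(self);
}

// Swap one VTable slot and return the entry that was there before.
Method hookSlot(FakeClient* object, std::size_t index, Method replacement) {
    Method previous = object->vft[index];
    object->vft[index] = replacement;
    return previous;
}

// Install the hook once; calling this twice would make the hook call itself.
void installHook(FakeClient* object) {
    g_original = hookSlot(object, 0, hookedGetPadding);
}
```

Because the slot is looked up through the object at runtime, no hard-coded offset into the DLL is needed, which is what makes this approach version independent.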

2 thoughts on “Recording audio of a single process: Part II”

  1. Ignacio Barreto

    Hi, nice articles with clear explanations.
    I have a question about retrieving the address of the virtual table. I’ve seen many examples, and none of them explains this in detail.
    Please confirm whether I’m correct.
    In this statement
    DWORD vTable = *(DWORD*)pAudioClient;
    I understand that this is the way to obtain the address of the virtual table of the pAudioClient object in order to hook and change some of its methods.
    The outermost * means to obtain a pointer. But what about (DWORD*)? I don’t understand the purpose of this cast. Is it to obtain an address?
    I would like to understand this in detail.
    I’ve seen in other examples that it does not have to be DWORD; it could be int, void or another typedef.
    I look forward to your answer,
    have a great 2016 and thanks!

    1. LMS Post author

      Correct, that obtains a pointer to the VTable. What the cast basically does is this: (DWORD*) makes the compiler treat pAudioClient as a pointer to a 4-byte (DWORD) value, which in your case is an address. This value is the address of the VTable of an audio client. The first (outermost) * then dereferences that pointer, so we end up with the actual VTable address in vTable. A void* or an int would work as the target type as well (i.e. casting to void** or int*), because all we care about here is having a type that is 4 bytes wide, as that is the size of the address of the VTable (assuming 32-bit). Thanks, I wish you a happy 2016 too!

