How to Write an RGSS Extension

Dll — Minimal Example

Using GCC, the source code:

extern __declspec(dllexport) int Add(int a, int b) { return a + b; }

or

extern __declspec(dllexport) int Add(int, int);
int Add(int a, int b) { return a + b; }

Compile it to a shared library (you may need to add -static or -fPIC):

gcc -m32 -shared -s -O a.c -o a.dll

Check it:

pedump -E a.dll
ORD ENTRY_VA  NAME
  1     14c0  Add

Call it in RGSS:

Win32API.new('a.dll', 'Add', 'LL', 'L').call(3, 5) #=> 8

C++?

Replace extern with extern "C" to turn off C++ name mangling (otherwise Add is exported as _Z3Addii). We can also use the block form {} now.

extern "C" {
  __declspec(dllexport) int Add(int a, int b) { return a + b; }
}

MSVC?

Refer to this post by azurefx

Use __stdcall or the alias WINAPI and CALLBACK, or specify /Gz when compiling.

Add a .def file to export undecorated names.

EXPORTS
Add
; other functions

Then open “x86 Native Tools Command Prompt for VS 20..”,

cl /c /Ox /Gz a.cpp #=> a.obj
link /dll /def:a.def a.obj #=> a.dll

Adding LIBRARY MyDllName to the .def file has the same effect as /out:MyDllName.dll.

NoDll — Just Win32API

Convention: return types and argument types are only ever pointer or long. Both are 4 bytes wide on x86. (void is not a big deal either.)

For example, the MessageBox:

int MessageBox(
  HWND    hWnd,
  LPCTSTR lpText,
  LPCTSTR lpCaption,
  UINT    uType
);

Is just:

long MessageBox(long, void*, void*, long);

We can declare it in RGSS like this:

MessageBox = Win32API.new('user32', 'MessageBox', 'LppL', 'L')

Notice LppL is just long, void*, void*, long.

Moreover, the argument types are already known from the values we pass when calling such an API, so:

def callapi dll, func, *args
  imports = args.map { |e| Integer === e ? 'L' : 'p' }
  Win32API.new(dll, func, imports, 'L').call(*args)
end

callapi 'user32', 'MessageBox', 0, 'content', 'title', 16

Notice how we simply pass the integer and string arguments.
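
The type inference inside callapi can be pulled out into its own method, which makes it easy to check without loading any DLL. This is only a sketch: import_signature is a hypothetical name, and it follows the same rule as callapi above (integers become 'L', everything else is treated as a pointer).

```ruby
# Hypothetical helper: derive the Win32API import signature from the
# Ruby arguments. Integers map to 'L' (long); anything else, such as a
# String buffer, maps to 'p' (pointer).
def import_signature(args)
  args.map { |arg| Integer === arg ? 'L' : 'p' }.join
end

# The MessageBox call from above: long, void*, void*, long
puts import_signature([0, 'content', 'title', 16]) # prints LppL
```

Note that a Float or a Symbol would also be classified as 'p' here, so this shortcut is only safe when every argument is an Integer or a String.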

How about a complex (struct) argument?

Remember the convention: struct fields are simply aligned to 4 bytes.

For example, GetCursorPos:

BOOL GetCursorPos(
  LPPOINT lpPoint
);
typedef struct tagPOINT {
  LONG x;
  LONG y;
} POINT, *PPOINT;

Is just:

long GetCursorPos(void*);
struct POINT { long; long; };

We can declare and use it in RGSS like this:

GetCursorPos = Win32API.new('user32', 'GetCursorPos', 'p', 'L')
GetCursorPos.call(doubleL = [].pack("x#{4 * 2}"))
point = doubleL.unpack('ll') # [12, 34] (LONG is signed, hence 'l')

It is quite easy to define some helper methods.

def buf(template)
  [].pack("x#{template.scan(/(\w)(\d+)?/).inject(0){|s,(c,i)|s+('aAxZ'.include?(c)?1:2**('CSLQ'.index c.upcase))*(i||1).to_i}}")
end

doubleL = buf('LL') #=> "\0\0\0\0\0\0\0\0"
callapi 'user32', 'GetCursorPos', doubleL
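
The same pattern extends to any struct made of 4-byte fields. As a minimal sketch, take RECT (four LONGs: left, top, right, bottom), the struct that GetWindowRect fills in. The helper names new_rect and read_rect are made up for this illustration, and the API call is faked with pack so the snippet also runs outside RGSS.

```ruby
# RECT { LONG left, top, right, bottom; } is four signed 32-bit fields,
# so the pack template is 'l4' and the buffer is 16 bytes long.

# Allocate a zero-filled buffer for the API to write into.
def new_rect
  [0, 0, 0, 0].pack('l4')
end

# Decode the buffer after the call into named fields.
def read_rect(buffer)
  left, top, right, bottom = buffer.unpack('l4')
  { :left => left, :top => top, :right => right, :bottom => bottom }
end

rect = new_rect
# In RGSS the buffer would be filled by something like
#   callapi 'user32', 'GetWindowRect', hwnd, rect
# Here the result is faked so the sketch runs anywhere.
rect[0, 16] = [-8, -8, 648, 488].pack('l4')
p read_rect(rect)
```

Using the signed 'l' rather than 'L' matters here because window and cursor coordinates can be negative.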

Here is a more complex helper script.

Misc

Remove trailing spaces (' ') and nulls ("\0").

"asd \0\0\0".unpack('A*') #=> ["asd"]

Remove trailing nulls only (the string is cut at the first null; spaces are kept).

"asd \0\0\0".unpack('Z*') #=> ["asd "]

Add trailing null.

["asd"].pack('Z*') #=> "asd\0"
["asd\0"].pack('Z*') #=> "asd\0\0"
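
These directives are handy when an API writes a C string into a fixed-size buffer and the rest of the buffer is padding. A small sketch, where from_cstr is a hypothetical helper name and the buffer content is faked with pack instead of a real API call:

```ruby
# Hypothetical helper: cut a buffer at the first NUL, like a C string.
def from_cstr(buffer)
  buffer.unpack('Z*').first
end

# Pretend an API wrote "Game.exe\0" into a 16-byte buffer.
# 'a16' pads the string with NULs up to 16 bytes.
buffer = ['Game.exe'].pack('a16')
p buffer.bytesize    #=> 16
p from_cstr(buffer)  #=> "Game.exe"
```

Since unpack always returns an array, the helper takes the first element to get the string itself.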

Convert between UTF-8 and WideChar (only for characters that fit in a single 16-bit unit, which includes all of ASCII).

'hello'.unpack('U*').pack('S*') #=> "h\0e\0l\0l\0o\0"
"h\0e\0l\0l\0o\0".unpack('S*').pack('U*') #=> 'hello'

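Use ASCII-8BIT (binary) encoding. String#b needs Ruby 2.0 or later; on Ruby 1.9, "hello".dup.force_encoding('ASCII-8BIT') does the same.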

"hello".b

NoDll — Port Dll to NoDll

This part needs x86 asm/machine code knowledge. Here is a quick reference.

Compile the Pure Binary

gcc -w -m32 -c -O -fno-ident a.c
objcopy -O binary -j .text a.o a

Taking the code from the beginning as an example, the output file a looks like this:

8b44 2408 0344 2404 c390 9090 

Inspect what it means with objdump:

objdump -M intel -S a.o
00000000 <_Add>:
   0:   8b 44 24 08             mov    eax,DWORD PTR [esp+0x8]
   4:   03 44 24 04             add    eax,DWORD PTR [esp+0x4]
   8:   c3                      ret
   9:   90                      nop
   a:   90                      nop
   b:   90                      nop

Very good. Now pass it to CallWindowProc, which lets us execute machine code.

CallWindowProc = Win32API.new 'user32', 'CallWindowProc', 'pLLLL', 'L'
CallWindowProc.call [
  0x8b, 0104,0044, 8,  # [1] mov 8(%esp), %eax
  0x03, 0104,0044, 4,  # [0] add 4(%esp), %eax
  0xc2,         16,0,  #     ret $16
].pack('C*'), 3, 5, 0, 0
# => 8
# don't do this even if it won't fail
# CallWindowProc.call IO.binread('a'), 3, 5, 0, 0 #=> 8

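Notice that we replace 'c3 90 90' (ret plus padding) with 'c2 10 00' (ret 16) to match the function type of WindowProc: it is stdcall with four 4-byte arguments, so the callee has to pop 16 bytes. For more details, see this zhihu zhuanlan article (in Chinese).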
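
For tiny functions, the byte replacement can be automated. The following is a deliberately naive sketch, not a general tool: to_wndproc is a hypothetical helper, and it assumes the function has exactly one ret (0xc3) and that this ret is the last instruction before any nop padding. Anything more complex should be patched by hand after reading the disassembly.

```ruby
# Hypothetical helper: turn the raw bytes of a small cdecl function into
# something with the WindowProc calling convention.
def to_wndproc(code)
  bytes = code.unpack('C*')
  bytes.pop while bytes.last == 0x90   # drop trailing nop padding
  raise ArgumentError, 'code does not end in ret' unless bytes.last == 0xc3
  bytes[-1, 1] = [0xc2, 0x10, 0x00]    # ret -> ret 16 (pops 4 arguments)
  bytes.pack('C*')
end

# The bytes of Add as produced by objcopy above.
add = ['8b44240803442404c3909090'].pack('H*')
puts to_wndproc(add).unpack('H*').first # prints 8b44240803442404c21000
```

The result is the same byte sequence as the hand-written array passed to CallWindowProc above.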

Not Enough.. Dynamic Contents

Things work well when the functions we export are pure.

How about this? (Still using C instead of C++.)

int f(int a, int b) {
  char *s = malloc(16);
  sprintf(s, "%d + %d = %d", a, b, a + b);
  return (int)s;
}

It becomes:

f:
   0:   57                      push   edi
   1:   56                      push   esi
   2:   53                      push   ebx
   3:   83 ec 20                sub    esp,0x20
   6:   8b 74 24 30             mov    esi,DWORD PTR [esp+0x30]
   a:   8b 7c 24 34             mov    edi,DWORD PTR [esp+0x34]
   e:   c7 04 24 10 00 00 00    mov    DWORD PTR [esp],0x10
  15:   e8 00 00 00 00          call   1a <_f+0x1a>
  1a:   89 c3                   mov    ebx,eax
  1c:   8d 04 3e                lea    eax,[esi+edi*1]
  1f:   89 44 24 10             mov    DWORD PTR [esp+0x10],eax
  23:   89 7c 24 0c             mov    DWORD PTR [esp+0xc],edi
  27:   89 74 24 08             mov    DWORD PTR [esp+0x8],esi
  2b:   c7 44 24 04 00 00 00    mov    DWORD PTR [esp+0x4],0x0
  32:   00
  33:   89 1c 24                mov    DWORD PTR [esp],ebx
  36:   e8 00 00 00 00          call   3b <_f+0x3b>
  3b:   89 d8                   mov    eax,ebx
  3d:   83 c4 20                add    esp,0x20
  40:   5b                      pop    ebx
  41:   5e                      pop    esi
  42:   5f                      pop    edi
  43:   c3                      ret

Notice the rows at 15, 2b and 36. Their operands are still zero, so we need to fill in these addresses at runtime:

Offset  Address of
0x16    malloc
0x2f    "%d + %d = %d"
0x37    sprintf
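
Filling in the table could look like the sketch below. It rests on several assumptions: patch is a hypothetical helper, the code is a binary (ASCII-8BIT) string, and the base address and the three target addresses are made-up numbers. In RGSS the real ones would have to be obtained at runtime, for example through GetProcAddress for malloc and sprintf. One detail is easy to miss: the e8 call instruction does not take an absolute address but a displacement relative to the end of its 4-byte operand, while the mov at 2b takes the absolute address of the string.

```ruby
# Hypothetical helper.
#   code  - raw .text bytes as a binary string
#   base  - address at which the bytes will be executed
#   calls - { operand offset => absolute target } for e8 (rel32) calls
#   data  - { operand offset => absolute address } for abs32 operands
def patch(code, base, calls, data)
  patched = code.dup
  calls.each do |offset, target|
    # displacement is measured from the end of the 4-byte operand
    rel32 = (target - (base + offset + 4)) & 0xffffffff
    patched[offset, 4] = [rel32].pack('V')
  end
  data.each do |offset, address|
    patched[offset, 4] = [address].pack('V')
  end
  patched
end

# The bytes of f from the listing above (0x44 bytes).
f_code = [%w[
  575653 83ec20 8b742430 8b7c2434 c7042410000000 e800000000
  89c3 8d043e 89442410 897c240c 89742408 c744240400000000
  891c24 e800000000 89d8 83c420 5b5e5fc3
].join].pack('H*')

# Made-up addresses, for illustration only.
patched = patch(f_code, 0x1000,
                { 0x16 => 0x2000, 0x37 => 0x4000 }, # malloc, sprintf
                { 0x2f => 0x3000 })                 # format string
puts patched.unpack('H*').first
```

Before the patched bytes could go through CallWindowProc, the final ret would still need the ret 16 treatment described earlier.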

That’s all!

© 2019 hyrious