Quote:
heh, is this a port of the app you showed off back in the CE days? -awesome getting that ported over
Not exactly a port, it is a clean remake based on the old ideas.
Quote:
Originally Posted by clrokr
Would you mind giving a technical explanation?
The idea is very simple:
- a PE file loader (loads files, processes relocations, runs TLS callbacks in emulation mode). It supports circular imports (DLL A imports B while B imports A), imports by ordinal, etc.
- a set of wrapper x86 DLLs (kernel32_stub.dll and so on) that "look like" the corresponding WinAPI functions to the emulated program:
Code:
#define DEFINE_FUNC1(name)      \
static const ModuleDef str_##name={DLL_NAME,#name};     \
EXTERN_C DW STUB_EXPORT stub_##name(DW p1)              \
{       \
        DW *p=&p1;      \
        __asm { mov eax,p }     \
        __asm { jmp f1 }        \
        __asm { mov eax,offset str_##name }     \
f1:     __asm { in eax,0xe5 }   \
        __asm { mov p,eax }     \
        return (DW)p;   \
}
.....
#define DEFINE_FUNC3(name)      \
static const ModuleDef str_##name={DLL_NAME,#name};     \
EXTERN_C DW STUB_EXPORT stub_##name(DW p1,DW p2,DW p3)          \
{       \
        DW *p=&p1;      \
        __asm { mov eax,p }     \
        __asm { jmp f1 }        \
        __asm { mov eax,offset str_##name }     \
f1:     __asm { in eax,0xe5 }   \
        __asm { mov p,eax }     \
        return (DW)p;   \
}
....
DEFINE_FUNC1(AddAtomA)
DEFINE_FUNC1(AddAtomW)
DEFINE_FUNC7(CreateFileA) -- the number in the macro name == the number of parameters of the __stdcall WinAPI function.
The compiler automatically generates "ret N*4" at the end of such a function.
I decided on this C+asm approach instead of a tiny assembler stub
because I can easily implement some of these functions in C directly in the stub DLL, and it
simplifies debugging. The functions also have the usual C prologue/epilogue, so
the emulated program may even patch them at runtime, for example to install hooks.
...
- a 32-bit x86 emulation engine (currently 2 engines: one from bochs and one from dosbox; I plan to add my own) that intercepts the "in eax,0xe5" instruction, determines which API the program needs and passes the call to a handler.
- native (ARM) API handler DLLs (kernel32_yact.dll and so on). They are mostly autogenerated too:
Code:
#define DEFINE_FUNC1(name) 	\
EXTERN_C DW STUB_IMPORT name(DW);	/* acts as a function prototype for the compiler */	\
EXTERN_C DW STUB_EXPORT yact_##name(DW *R)	/* R - pointer to the x86 stack */	\
{	\
  DW r=name(p1);	/* call the function with parameters from the emulated stack: p1==R[0], p2==R[1] and so on */	\
  LEAVE(1);		/* empty macro, as the stack is now unwound in the x86 stub DLL */	\
  return r;		\
}
...
#define DEFINE_FUNC3(name) 	\
EXTERN_C DW STUB_IMPORT name(DW,DW,DW);	\
EXTERN_C DW STUB_EXPORT yact_##name(DW *R)		\
{	\
  DW r=name(p1,p2,p3);	\
  LEAVE(3);		\
  return r;		\
}
...
DEFINE_FUNC1(AddAtomA)
DEFINE_FUNC1(AddAtomW)
DEFINE_FUNC7(CreateFileA)  // as you can see, this list is identical to the one in the x86 stub, so I can use the same stub-generator tool
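To make the stub/handler round trip above more concrete, here is a minimal sketch of the dispatch step, not the actual emulator code. The names `Handler`, `registry`, `on_trap`, `native_add3` and `yact_Add3` are made up for illustration, and the post does not show how the engine recovers the DLL and function name at the trap, so the sketch simply takes them as parameters.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

using DW = std::uint32_t;

// Hypothetical handler signature mirroring yact_##name(DW *R): R points at
// the emulated stack, so R[0] is the first argument, R[1] the second, etc.
using Handler = DW (*)(DW* R);

// Hypothetical registry keyed by (DLL name, function name), the same pair
// the x86 stub keeps in its ModuleDef record.
using Key = std::pair<std::string, std::string>;

std::map<Key, Handler>& registry() {
    static std::map<Key, Handler> handlers;
    return handlers;
}

// Stand-in for a native API; a real handler would call the OS function.
DW native_add3(DW a, DW b, DW c) { return a + b + c; }

// What a DEFINE_FUNC3-style handler boils down to after macro expansion.
DW yact_Add3(DW* R) { return native_add3(R[0], R[1], R[2]); }

// Assumed entry point, called by the CPU core when it decodes the trap
// instruction. On success the caller would copy `eax` into the emulated EAX
// and resume the stub, whose "ret N*4" then pops the arguments.
bool on_trap(const std::string& dll, const std::string& name,
             DW* guest_args, DW& eax) {
    auto it = registry().find(Key(dll, name));
    if (it == registry().end()) {
        return false;  // API not implemented by any handler DLL
    }
    eax = it->second(guest_args);
    return true;
}
```

Usage would be to register `yact_Add3` under a key such as `("demo_stub.dll", "Add3")` and call `on_trap` with a pointer to three stacked arguments; a lookup that fails is the place where a real engine would report a missing API.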
Some functions require more complex emulation, either because they are absent on ARM or because they call back into x86 code:
Code:
static DWORD WINAPI ThreadProc(
  LPVOID lpParameter	// [0] == orig func, [1] == orig param
)
{
	__EXCEPTION_REGISTRATION_RECORD R;
	DWORD *Parm=(DWORD*)lpParameter;
	DWORD *TEB=(DWORD*)PeLdrGetCurrentTeb();
	R.Next=(__EXCEPTION_REGISTRATION_RECORD*)-1;
	R.Handler=(void*)CbReturnToHost();
	TEB[0]=(DWORD)&R;	// in case of unhandled exception - just return 
	PeLdrNotifyNewThread(NULL,DLL_THREAD_ATTACH);

	DWORD Ret=EmuExecute(Parm[0],1,Parm[1]); // 1 == number of parameters to the emulated function
	delete[] Parm;	// allocated with new[] in yact_CreateThread
	return Ret;
}

EXTERN_C DW STUB_EXPORT yact_CreateThread(DW *R)
{	
	DWORD* Parm=new DWORD[2];
	Parm[0]=p3;                               // TODO: no out-of-memory checking for now
	Parm[1]=p4;
	DWORD StackSize=p2;
	if(StackSize)
		StackSize+=1024*1024;      // I reserve some space for my own needs (debugging)
	else
		StackSize=2*1024*1024;     // TODO: I don't support autogrow stacks, so reserve 2 MB

	DWORD t=(DWORD)CreateThread((LPSECURITY_ATTRIBUTES)p1,StackSize,ThreadProc,Parm,p5,(LPDWORD)p6);
	LEAVE(6);		
	return t;
}
Some of the COM interfaces are already implemented, for example DirectDraw and DirectSound, though they are not heavily debugged. On a desktop build of the emulator I can already run "Heroes of Might and Magic 3" and an old WinRAR, but there are several RT-specific OS limitations I need to bypass before they will run on ARM. Current work in progress: overcoming the RT limitations, manually implementing the API functions that call back into program code (like CreateThread, RegisterClassA and so on), and adding stubs for other system DLLs/COM objects.
Manually thrown SEH exceptions are fully supported, but access violations, int3 and similar OS-generated exceptions will crash the program. Some of the TEB fields (TLS and the fields required by the Borland compilers) are implemented too.

I don't translate pointers in the emulated code, nor do I check the parameters passed to the API. As a side effect, the emulated program may trash the emulator's memory, but this greatly increases speed.
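As a rough illustration of that trade-off, the sketch below contrasts the two memory models. `read32_direct` and `GuestMemory` are hypothetical names, not the emulator's own: the first treats a guest address as a host address (which only works when the emulator runs as a 32-bit process sharing the address space with the program), the second is the translated and checked alternative that is being avoided here.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Direct model: a guest address is used as a host address, so a load is
// just a load. A bad pointer from the emulated program hits host memory.
inline std::uint32_t read32_direct(std::uintptr_t addr) {
    std::uint32_t value;
    std::memcpy(&value, reinterpret_cast<const void*>(addr), sizeof value);
    return value;
}

// Translated model (hypothetical alternative): guest memory lives in a
// private buffer and every access is range-checked and rebased first.
struct GuestMemory {
    std::uint32_t base;               // guest address of bytes[0]
    std::vector<std::uint8_t> bytes;  // backing storage

    bool read32(std::uint32_t addr, std::uint32_t& out) const {
        if (addr < base) {
            return false;  // below the mapped range
        }
        const std::uint64_t offset = std::uint64_t(addr) - base;
        if (offset + 4 > bytes.size()) {
            return false;  // runs past the end of the mapped range
        }
        std::memcpy(&out, bytes.data() + offset, 4);
        return true;
    }
};
```

The checked version costs a compare and an add on every access and also forces every pointer passed to a native API to be converted, which is the overhead the direct model skips.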
Most x86 EXE files don't contain a relocation section and need to be loaded at a specific address (typically 0x400000). This is not a problem on a desktop, as I can rebase my emulator's EXE to any address I need and free the corresponding address range for the emulated program, but on ARM this is the main problem. So currently only EXEs with relocs are supported for emulation, though there are ways to overcome this. Some EXEs produced by old Borland compilers also contain "broken" relocs, which is a smaller problem.
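For readers unfamiliar with why relocs matter, here is a minimal sketch of applying a PE base relocation table when an image ends up at a different address than its preferred one. This is not the loader's real code: `apply_relocs` and its parameters are illustrative, it assumes a little-endian host, and it handles only the two entry types a 32-bit x86 image normally uses (0 = padding, 3 = HIGHLOW). An image with no table at all cannot be fixed up this way, which is why it must sit at its preferred base.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Relocation entry types from the PE format.
constexpr unsigned kRelAbsolute = 0;  // padding entry, skipped
constexpr unsigned kRelHighLow = 3;   // add the full 32-bit delta

static std::uint32_t load32(const std::uint8_t* p) {
    std::uint32_t v;
    std::memcpy(&v, p, 4);
    return v;
}

static std::uint16_t load16(const std::uint8_t* p) {
    std::uint16_t v;
    std::memcpy(&v, p, 2);
    return v;
}

// Walks the relocation blocks in `reloc` and patches the mapped `image`.
// Each block is: page RVA (4 bytes), block size (4 bytes), then 16-bit
// entries with the type in the top 4 bits and the page offset in the low 12.
// Returns false on a malformed table or an unsupported entry type.
bool apply_relocs(std::vector<std::uint8_t>& image,
                  const std::vector<std::uint8_t>& reloc,
                  std::uint32_t preferred_base, std::uint32_t actual_base) {
    const std::uint32_t delta = actual_base - preferred_base;
    std::size_t pos = 0;
    while (pos + 8 <= reloc.size()) {
        const std::uint32_t page_rva = load32(&reloc[pos]);
        const std::uint32_t block_size = load32(&reloc[pos + 4]);
        if (block_size < 8 || block_size > reloc.size() - pos) {
            return false;
        }
        for (std::size_t i = pos + 8; i + 2 <= pos + block_size; i += 2) {
            const std::uint16_t entry = load16(&reloc[i]);
            const unsigned type = entry >> 12;
            const std::uint32_t rva = page_rva + (entry & 0x0FFFu);
            if (type == kRelAbsolute) {
                continue;
            }
            if (type != kRelHighLow) {
                return false;
            }
            if (rva > image.size() || image.size() - rva < 4) {
                return false;
            }
            const std::uint32_t patched = load32(&image[rva]) + delta;
            std::memcpy(&image[rva], &patched, 4);
        }
        pos += block_size;
    }
    return true;
}
```

With an image built for 0x400000 and mapped somewhere else, every absolute address listed in the table is shifted by the same delta; the strict type check is also where a "broken" table would show up as a failure.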
...