Declare mem_copy_* functions as FASTCALL
Split sh4_flush_store_queue into TLB/non-TLB versions, and optimize
slightly based on that
src/mem.h
src/sh4/mmu.c
src/sh4/sh4core.h
src/sh4/sh4mem.c
src/sh4/sh4x86.in
src/test/testsh4x86.c
Add shortcut test for long writes to the store queue (far and away the most popular P4 write)
src/sh4/sh4mem.c
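
The shortcut described above can be sketched roughly as follows. This is not the code from sh4mem.c: the names `store_queue`, `p4_write_slow` and `p4_write_long` are hypothetical, and the only facts relied on are the SH4 memory map (the store queue area occupies 0xE0000000-0xE3FFFFFF, with address bit 5 selecting SQ0/SQ1 and bits 4..2 selecting the longword).

```c
#include <stdint.h>

/* Hypothetical stand-in for the emulator's store-queue storage:
 * two 32-byte queues (SQ0, SQ1) kept as 16 consecutive longwords. */
uint32_t store_queue[16];

/* Counts how often the general (slow) path was taken, for illustration. */
int slow_path_calls = 0;

/* Hypothetical fallback that would handle every other P4 address. */
void p4_write_slow(uint32_t addr, uint32_t val)
{
    (void)addr;
    (void)val;
    slow_path_calls++;
}

/* Long write into the P4 region: test for the store queue first,
 * since that is by far the most common case. */
void p4_write_long(uint32_t addr, uint32_t val)
{
    if ((addr & 0xFC000000u) == 0xE0000000u) {
        /* Bits 5..2 pick the queue and the longword within it. */
        store_queue[(addr >> 2) & 0x0Fu] = val;
        return;
    }
    p4_write_slow(addr, val);
}
```

The point of the design is that one mask-and-compare resolves the hot case before any general P4 dispatch runs.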
Add --enable-profiled configure option for convenience (and enable fastcall only on fully optimized builds)
config.h.in
configure
configure.in
src/lxdream.h
Fix x86-64 build (typos et al)
Remove push/pop of ebx - it isn't really needed, and dropping it avoids adding more target-specific asm
src/sh4/ia64abi.h
src/sh4/sh4x86.in
Change xlat_get_native_pc to take the expected code region - this lets the Mac
unwind implementation range-test the instruction pointer (which works) rather than EBP
(which doesn't, for some reason).
Remove the configure test that prevented -fomit-frame-pointer from being used in Mac
builds.
configure
configure.in
src/sh4/ia32abi.h
src/sh4/ia32mac.h
src/sh4/sh4trans.c
src/sh4/xltcache.h
Use regparm calling conventions for all functions called from translated code,
along with a few other high-use functions. This could probably be extended to all functions,
but even as it stands it is a nice performance boost
src/lxdream.h
src/sh4/ia32abi.h
src/sh4/ia32mac.h
src/sh4/mmu.c
src/sh4/sh4.c
src/sh4/sh4core.h
src/sh4/sh4mem.c
src/sh4/sh4stat.h
src/sh4/sh4stat.in
src/sh4/sh4trans.c
...
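
As an illustration of the register-parameter convention mentioned above, a macro along these lines is one way to apply it. This is a sketch under assumptions: the exact macro definition, the use of `regparm(3)`, the `__OPTIMIZE__` guard and the helper `example_read_long` are all hypothetical and are not taken from src/lxdream.h. On 32-bit x86 with GCC, `regparm(3)` passes up to three integer arguments in EAX, EDX and ECX instead of on the stack.

```c
#include <stdint.h>

/* Hypothetical definition: enable the register calling convention only
 * on optimized 32-bit x86 GCC builds, and compile to nothing elsewhere. */
#if defined(__GNUC__) && defined(__i386__) && defined(__OPTIMIZE__)
#define FASTCALL __attribute__((regparm(3)))
#else
#define FASTCALL
#endif

/* Hypothetical helper of the kind translated code might call.
 * The attribute goes on the declaration so every caller agrees on it. */
FASTCALL uint32_t example_read_long(const uint32_t *mem, uint32_t index);

FASTCALL uint32_t example_read_long(const uint32_t *mem, uint32_t index)
{
    return mem[index];
}
```

Because the convention changes how arguments are passed, hand-written or generated call sites have to load the same registers, which is presumably why the ABI headers are touched in the same change.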
Enable the FIPR SSE3 code for now, and add a comment on the sh4r.fr alignment
src/sh4/sh4.h
src/sh4/sh4x86.in
Add SSE3 versions of FIPR and FTRV - the latter is about a 4.5% improvement
src/sh4/sh4.c
src/sh4/sh4.h
src/sh4/sh4x86.in
src/sh4/x86op.h
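
For readers unfamiliar with the instruction, FIPR FVm, FVn computes the four-element dot product of two vector registers and stores it in the last element of FVn. The sketch below shows how SSE3 horizontal adds can do this; it uses compiler intrinsics rather than the emitted machine code in sh4x86.in, the function name `fipr_sketch` is hypothetical, and it falls back to scalar code when the compiler is not targeting SSE3. Unaligned loads are used so that no alignment of the register file is assumed; an aligned load would need 16-byte aligned storage.

```c
#ifdef __SSE3__
#include <pmmintrin.h>
#endif

/* Sketch of FIPR FVm, FVn: fvn[3] = fvm . fvn (4-element dot product).
 * Both pointers must reference four consecutive floats. */
void fipr_sketch(const float *fvm, float *fvn)
{
#ifdef __SSE3__
    __m128 a = _mm_loadu_ps(fvm);      /* unaligned load of FVm */
    __m128 b = _mm_loadu_ps(fvn);      /* unaligned load of FVn */
    __m128 p = _mm_mul_ps(a, b);       /* element-wise products */
    p = _mm_hadd_ps(p, p);             /* pairwise sums */
    p = _mm_hadd_ps(p, p);             /* total in every lane */
    fvn[3] = _mm_cvtss_f32(p);         /* write the result element only */
#else
    /* Scalar fallback with the same result for exactly representable inputs. */
    fvn[3] = fvm[0] * fvn[0] + fvm[1] * fvn[1]
           + fvm[2] * fvn[2] + fvm[3] * fvn[3];
#endif
}
```

FTRV (a 4x4 matrix times vector transform) can be viewed as four such dot products.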