Worked on some minor translator-related optimizations this week, for some decent gains. If I can knock off another 10-20% I should be just about in the “real-time” ballpark at long last. The big candidates are memory accesses (sh4_read_long by itself is nearly 20% of the total runtime at the moment) and PVR2 status updates (close to 20% under at least some loads). It’s not that either is particularly complex or expensive; they’re just called a lot. I can probably knock another 1-2% off the main translate-and-execute loop too.
- Fixed heap smash in the translation cache
- Added initial GLSL shader support
- Rewrote translation exit block (gained ~10% performance out of it, and freed up EDI in case it’s worth using elsewhere). The system also seems to build correctly with -O2 now, which gives another 10% improvement.
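
To put the profile figures above in perspective, here is a rough back-of-the-envelope sketch of how much total runtime is saved when one hot function gets faster. The helper name runtime_saved and the speedup factor are purely illustrative assumptions, not anything from the emulator source; only the ~20% share comes from the numbers quoted above.

```c
#include <assert.h>

/* Hypothetical helper: fraction of total runtime saved when a function
 * that accounts for `share` of the runtime (0.0 to 1.0) is made `factor`
 * times faster. The rest of the program is assumed to be unchanged. */
static double runtime_saved(double share, double factor)
{
    assert(share >= 0.0 && share <= 1.0);
    assert(factor > 0.0);
    /* The function's new cost is share / factor; the difference is saved. */
    return share - share / factor;
}
```

With a 20% share, runtime_saved(0.20, 2.0) comes to 0.10, so a function at that level would need to run about twice as fast to recover 10% of the total. That is roughly why the frequently called functions are the interesting targets, even though none of them is expensive on its own.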