Archive for September, 2007

September 28th, 2007 by nkeynes
Optimization time
Posted in Development

Worked on some minor translator-related optimizations this week, for some decent gains. If I can knock off another 10-20% I should be just about in the “real-time” ballpark at long last. The big candidates are memory accesses (sh4_read_long is nearly 20% of the total runtime at the moment by itself), and PVR2 status updates (close to 20% under at least some loads). It’s not that either is particularly complex or expensive, they’re just called a lot. I can probably knock another 1-2% off the main translate-and-execute loop too.
Changes:

  • Fixed heap smash in the translation cache
  • Added initial GLSL shader support
  • Rewrote translation exit block (Gained ~10% performance out of it and freed up EDI in case it’s worth using elsewhere). The system also seems to build correctly with -O2 now, which gives another 10% improvement.
September 20th, 2007 by nkeynes
Milestone 3 released
Posted in Releases

M3 is now out as promised. Essentially the same as M2 but with better performance courtesy of the translator. I’d appreciate feedback from anyone who’d like to try it out and let me know how it runs.
M4 work plan (Dec 07):

  • Real user interface
  • Further performance improvements + optimizations
  • Video improvements (various open bugs)
  • MMU support
  • Whatever else I get time for ^_^
September 20th, 2007 by nkeynes
O frabjous day
Posted in Development

As of this morning the system successfully runs all the way through the boot sequence using the translator core. Currently performance is around 66% of full speed on the dev machine – a huge improvement over the emu core. In fact, it’s almost playable…

I’ll do some testing tonight, and as long as I don’t find anything critical there’ll be an M3 tomorrow (pretty much just M2 + SH4 translator).
Changes:

  • Added tests for another group of opcodes
  • Fixed several broken instructions (translator)
  • Finished translation cache invalidation
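
As a rough illustration of what translation cache invalidation involves (this is not the actual lxdream code), here is a minimal sketch in C. It assumes a tiny guest address space, one lookup slot per 2-byte SH4 instruction address, and page-granular invalidation on guest writes. All names (`code_map`, `cache_insert`, `cache_lookup`, `cache_invalidate_write`) are hypothetical, and blocks that span a page boundary are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS      12
#define PAGE_SIZE      (1u << PAGE_BITS)
#define NUM_PAGES      16               /* tiny guest address space, sketch only */
#define SLOTS_PER_PAGE (PAGE_SIZE / 2)  /* SH4 instructions are 2 bytes wide */

/* One slot per possible instruction address; NULL means "not translated". */
static const void *code_map[NUM_PAGES][SLOTS_PER_PAGE];
/* Non-zero when a page holds at least one translated entry point. */
static uint8_t page_has_code[NUM_PAGES];

static uint32_t page_of(uint32_t guest_addr)
{
    return (guest_addr >> PAGE_BITS) % NUM_PAGES;
}

static uint32_t slot_of(uint32_t guest_addr)
{
    return (guest_addr & (PAGE_SIZE - 1)) >> 1;
}

/* Record that native code exists for the block starting at guest_addr. */
void cache_insert(uint32_t guest_addr, const void *native_code)
{
    assert(native_code != NULL);
    code_map[page_of(guest_addr)][slot_of(guest_addr)] = native_code;
    page_has_code[page_of(guest_addr)] = 1;
}

/* Return the native code for guest_addr, or NULL if it needs translating. */
const void *cache_lookup(uint32_t guest_addr)
{
    return code_map[page_of(guest_addr)][slot_of(guest_addr)];
}

/* Called on every guest write: drop all translations in the written page. */
void cache_invalidate_write(uint32_t guest_addr)
{
    uint32_t page = page_of(guest_addr);
    if (!page_has_code[page])
        return;                         /* common case: page holds no code */
    for (size_t i = 0; i < SLOTS_PER_PAGE; i++)
        code_map[page][i] = NULL;
    page_has_code[page] = 0;
}
```

The per-page flag keeps the check on ordinary data writes down to a single byte test, which matters because writes are far more frequent than invalidations.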
September 18th, 2007 by nkeynes
Almost there…
Posted in Development

Many bug fixes later (mostly dumb errors of either the cut-n-paste or pure braino variety), the translator is almost running correctly. It now runs well enough to start collecting timing information – at the moment, on the BIOS startup, it looks to be running at around twice the overall speed of the pure emu core. Which is a nice start, but not really nice enough unfortunately. On some tight test loops it actually executes at 10x emu speed – closer to where I’m aiming for.
The next step (other than clearing out the remaining bugs) is to start collecting some statistics, and see if there are some simple peephole optimizations we can apply.
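
To give a flavour of what a peephole optimization looks like, here is a minimal sketch in C over a made-up intermediate form (the `ir_op` type and `peephole` function are hypothetical, not part of lxdream). An instruction-at-a-time translator tends to store a guest register back to memory at the end of one instruction and reload it at the start of the next. The pass below drops a load that immediately follows a store of the same host register to the same guest register slot.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OP_LOAD, OP_STORE, OP_ADDI } op_kind;

typedef struct {
    op_kind kind;
    int reg;   /* host register number */
    int arg;   /* LOAD/STORE: guest register slot; ADDI: immediate value */
} ir_op;

/*
 * Rewrites ops[] in place and returns the new length.
 * Pattern removed: STORE reg -> slot immediately followed by
 * LOAD slot -> reg, where the load is redundant because the host
 * register already holds the value.
 */
size_t peephole(ir_op *ops, size_t n)
{
    assert(ops != NULL || n == 0);
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (out > 0 && ops[i].kind == OP_LOAD) {
            const ir_op *prev = &ops[out - 1];
            if (prev->kind == OP_STORE &&
                prev->reg == ops[i].reg &&
                prev->arg == ops[i].arg) {
                continue;   /* skip the redundant reload */
            }
        }
        ops[out++] = ops[i];
    }
    return out;
}
```

Running this over the sequence load/add/store, load/add/store on the same register and slot shrinks six operations to five. Whether such a pattern is actually common enough to matter is exactly what the statistics are for.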

September 13th, 2007 by nkeynes
August-September Update
Posted in Development

After much umming and ahing, I’ve scrapped the translator generator for the time being – it’s become far too complex for its own good, and just wasn’t going to be finished in a reasonable time. I will be keeping the original (much simpler) decoder generator though as part of the lxdream source tree. Instead, I’ve been working on finishing the instruction-at-a-time translator (i.e., the simplest thing that could possibly work) with a view to getting it working and seeing what the performance is like, before getting into anything more complex.
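
For readers unfamiliar with the idea, an instruction-at-a-time translator can be sketched roughly as below. This is an illustration under stated assumptions, not the lxdream implementation: it emits abstract `host_op` records instead of real x86 code, decodes only a handful of SH4 opcodes (ADD Rm,Rn; ADD #imm,Rn; MOV #imm,Rn; NOP; RTS), and ends a block at RTS after translating its delay-slot instruction. The names `host_op`, `translate_one` and `translate_block` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    HOP_ADD_REG, HOP_ADD_IMM, HOP_MOV_IMM, HOP_NOP, HOP_RET,
    HOP_FALLBACK    /* anything not handled here: defer to the emu core */
} host_kind;

typedef struct {
    host_kind kind;
    int rn;         /* destination guest register */
    int arg;        /* source register, immediate, or raw opcode */
} host_op;

/* Translate exactly one 16-bit SH4 opcode with no context. */
static host_op translate_one(uint16_t ir)
{
    int rn = (ir >> 8) & 0xF;
    int rm = (ir >> 4) & 0xF;
    int imm = ir & 0xFF;
    if (imm & 0x80)
        imm -= 0x100;                       /* sign-extend 8-bit immediate */

    if ((ir & 0xF00F) == 0x300C)            /* ADD Rm,Rn */
        return (host_op){ HOP_ADD_REG, rn, rm };
    if ((ir & 0xF000) == 0x7000)            /* ADD #imm,Rn */
        return (host_op){ HOP_ADD_IMM, rn, imm };
    if ((ir & 0xF000) == 0xE000)            /* MOV #imm,Rn */
        return (host_op){ HOP_MOV_IMM, rn, imm };
    if (ir == 0x0009)                       /* NOP */
        return (host_op){ HOP_NOP, 0, 0 };
    return (host_op){ HOP_FALLBACK, 0, ir };
}

/* Translate until RTS (plus its delay slot) or until input/output runs out. */
size_t translate_block(const uint16_t *code, size_t n,
                       host_op *out, size_t out_cap)
{
    assert(code != NULL && out != NULL);
    size_t emitted = 0;
    for (size_t i = 0; i < n && emitted + 2 <= out_cap; i++) {
        if (code[i] == 0x000B) {            /* RTS */
            if (i + 1 < n)                  /* delay slot runs before return */
                out[emitted++] = translate_one(code[i + 1]);
            out[emitted++] = (host_op){ HOP_RET, 0, 0 };
            break;
        }
        out[emitted++] = translate_one(code[i]);
    }
    return emitted;
}
```

Each opcode is translated in isolation, with no register allocation or cross-instruction analysis, which is what makes the approach simple and also what leaves room for later optimization.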

The translator is now in an early testing form, which is to say that it’s mostly complete, and you can actually run it on real code, but it doesn’t work very well yet. I’ll be spending the next few days polishing things and getting the test suite running correctly, and then we can start testing for real.
Changes:

  • Committed decoder generator for SH4 core, disassembler, and translator
  • Committed instruction-at-a-time translator and hooked up via a command-line option
  • Merged i386 disassembler from binutils (for debugging purposes)
  • Fixed mac.l and mac.w opcodes in the emu core (they look to have been completely broken)
  • Fixed some edge cases with float and ftrc (still checking other FP opcodes)
  • Fixed crash when video driver failed to initialize (now degrades a little more gracefully)
  • Hooked up video shutdown call for the GTK/GLX driver so that it actually works now
  • Fixed rendering in headless mode (now it just doesn’t render anything, rather than crashing)
  • Added ability to terminate after a specified period of emulated time (useful for time trials)
  • Added many more SH4 test cases
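
The "terminate after a specified period of emulated time" item can be pictured as a run loop that counts emulated nanoseconds rather than wall-clock time. The sketch below is an assumption about the general shape, not the lxdream code: `run_until`, `run_slice_fn` and the stand-in core `fake_slice` are all hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* A core runs for up to `nanosecs` of emulated time and reports how much
 * it actually ran (0 means it stopped, e.g. on a breakpoint or error). */
typedef uint32_t (*run_slice_fn)(uint32_t nanosecs);

/* Stand-in core for the sketch: pretends to run the whole slice. */
uint32_t fake_slice(uint32_t nanosecs)
{
    return nanosecs;
}

/* Run in slices until limit_ns of emulated time has elapsed.
 * Returns the emulated time actually executed. */
uint64_t run_until(uint64_t limit_ns, uint32_t slice_ns, run_slice_fn run_slice)
{
    assert(slice_ns > 0 && run_slice != 0);
    uint64_t elapsed = 0;
    while (elapsed < limit_ns) {
        uint64_t remaining = limit_ns - elapsed;
        /* Shorten the final slice so we stop exactly at the limit. */
        uint32_t want = remaining < slice_ns ? (uint32_t)remaining : slice_ns;
        uint32_t ran = run_slice(want);
        if (ran == 0)
            break;          /* core halted early */
        elapsed += ran;
    }
    return elapsed;
}
```

Because the limit is expressed in emulated time, two runs of the same code do the same amount of guest work, so their wall-clock durations can be compared directly in a time trial.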