Archive for January, 2009

January 28th, 2009 by nkeynes
Testing with Intel’s icc compiler
Posted in Development

Out of sheer curiosity, I thought it might be worth seeing how icc performs on lxdream – short answer, not too shabby at all. All tests were otherwise run with the same command options, best of 3 runs:

Compiler                  5-second core runtime   Improvement
gcc -O2                   3.10s                   N/A
gcc -O2 -fprofile-use     2.96s                   4.6%
icc -fast                 2.96s                   4.6%
icc -fast -prof-use       2.73s                   12%

The profile builds use a profile generated from the same test. 5% is kind of meh, but 12% on the icc profile build… ok, that’s pretty nice. I will probably have to look at generating some decent general-purpose profile traces for production builds.

In any event, I’ve added support for building with icc, for the benefit of the 2 people who actually have it ^_^.

Otherwise I’ve finally got some very basic UTLB test cases in now, and fixed a number of bugs that turned up – it seems to be at least as stable as the old version was by now (which actually still had a few bugs too incidentally…)

January 14th, 2009 by nkeynes
Memory system rewrite
Posted in Development

The memory system rewrite is merged now – there are a few things I’m not completely happy with yet, and the old page_map isn’t quite gone completely, but on the whole it’s simpler, faster, and much more consistent. More importantly perhaps, UTLB translation is now _very_ cheap (3-instruction overhead[0] for OSes using the typical 4K page) – Linux now boots and runs at full speed on my systems. There are probably a few lingering issues and I’m still working on a good test suite for it[1], but most bugs are likely to be in things that never worked before anyway.
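For anyone wondering how a translation can get that cheap, here is a rough sketch of the general idea: a flat table indexed by virtual page number, so a lookup is a shift, a load, and a mask/or. This is an illustration only and not the actual lxdream code; the names (flat_tlb, flat_tlb_map, flat_tlb_translate) and the UNMAPPED sentinel are all made up for the example, and it assumes a 32-bit address space with 4K pages only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_BITS  12                          /* 4K pages */
#define PAGE_SIZE  (1u << PAGE_BITS)
#define PAGE_MASK  (PAGE_SIZE - 1u)
#define PAGE_COUNT (1u << (32 - PAGE_BITS))    /* one entry per virtual page */
#define UNMAPPED   0xFFFFFFFFu                 /* not page-aligned, so never a valid base */

/* Hypothetical flat translation table: physical page base per virtual page. */
typedef struct {
    uint32_t *page_base;
} flat_tlb;

/* Allocate the table (4MB) and mark every page as unmapped. Returns 0 on success. */
static int flat_tlb_init(flat_tlb *t)
{
    t->page_base = malloc(PAGE_COUNT * sizeof(uint32_t));
    if (t->page_base == NULL)
        return -1;
    for (uint32_t i = 0; i < PAGE_COUNT; i++)
        t->page_base[i] = UNMAPPED;
    return 0;
}

/* Map the 4K virtual page containing vaddr to the physical page containing paddr. */
static void flat_tlb_map(flat_tlb *t, uint32_t vaddr, uint32_t paddr)
{
    assert(t->page_base != NULL);
    t->page_base[vaddr >> PAGE_BITS] = paddr & ~PAGE_MASK;
}

/* Translate vaddr. Returns 0 and stores the result in *paddr, or -1 on a miss. */
static int flat_tlb_translate(const flat_tlb *t, uint32_t vaddr, uint32_t *paddr)
{
    uint32_t base = t->page_base[vaddr >> PAGE_BITS];   /* shift + load */
    if (base == UNMAPPED)
        return -1;                                      /* would raise a TLB miss */
    *paddr = base | (vaddr & PAGE_MASK);                /* keep the page offset */
    return 0;
}
```

The trade-off in a scheme like this is memory for speed: the table costs 4MB per address space and has to be updated whenever a mapping changes, but the hot path never has to search anything.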

I also have some work-in-progress on the operand cache (nominally the original reason I started doing the rewrite…), but it’s still showing a bit more of a performance hit than I would like (10-15%). So currently I’m thinking this will probably wait for the next version before being fully integrated and finished. It does need to be done eventually though for correctness reasons, since the SH4 doesn’t ensure cache-coherency in hardware.

In any case, once the MMU tests are done I’ll get back on the translator upgrade. It’s looking at this stage like 0.9.1 will end up being almost purely a performance release, but since it should be at least twice as fast overall as 0.9, no one is really going to complain about that, right? ^_^

[0] We might be able to special case sdram access and get that case down to 0 instructions, but leaving that aside until after the op-cache is done…
[1] Annoyingly enough, there doesn’t seem to be a good way to recover from TLB multi-hit resets on the DC, which makes it a little hard to test that aspect of things… Even more annoyingly, the DC BIOS _does_ vector manual resets through 0x8c000018, but not any other reset.