My P3's are now slower than a P2 with v2.20 client!!
Stromkarl
12-13-2004, 11:16 PM
I have installed v2.20 on all the machines I have working on SOB. I have noticed a 2x-3x speed increase on P4's, a 2x speed increase on P2's, and an even or slightly lower speed for the P3's. I have a P2-400MHz now going faster than almost all of my P3's. Is there any way that the P3 optimizations could have been missed or not activated in compiling?
The P2 has 128MB of memory on a Win98SE machine. All the P3's are between 500MHz and 1GHz with 256-512MB of memory on Win98SE systems. Once restarted, the P3's take forever to climb to peak speed - quite a bit longer than with the v1.25 client. The P2 and P4 restart time to max speed is very quick - like the v1.25 client.
Has anyone else noticed the same issue? Can anyone help or at least look into this?
Stromkarl
Member of TeamRetro
Mystwalker
12-14-2004, 04:28 AM
Well, the rewritten FFT routines seem to be ~33% slower on P3 CPUs: http://www.mersenneforum.org/showthread.php?t=3387&page=1&pp=50
I don't know if v2.2 of the SoB client branches to the old FFT code once it detects a P3/P4, though.
Ken_g6[TA]
12-14-2004, 02:59 PM
My PIII-type Celeron 667 is somewhat faster with 2.2 than with 1.2x. :scratch:
Hmmm... Speedups have been reported on Linux, and the Celeron is on Win2K. Maybe it's something about Win98?
I've installed this on a selection of 600-850MHz P3s and a 1GHz Mobile P3 and all of them sped up by ~50% or more.
jjjjL
12-14-2004, 10:41 PM
Originally posted by Mystwalker
Well, the rewritten FFT routines seem to be ~33% slower on P3 CPUs: http://www.mersenneforum.org/showthread.php?t=3387&page=1&pp=50
I don't know if v2.2 of the SoB client branches to the old FFT code once it detects a P3/P4, though.
The FFT routines are slower, yes, but the new IBDWT algo lets the client operate on a half-size FFT. Of course, there are only a limited number of actual FFT sizes, so a few tests may fall into zones where the old version has a slight advantage.
Overall, it will be faster with the new version.
Cheers,
Louie
Mystwalker
12-15-2004, 07:54 AM
Originally posted by jjjjL
The FFT routines are slower, yes, but the new IBDWT algo lets the client operate on a half-size FFT. Of course, there are only a limited number of actual FFT sizes, so a few tests may fall into zones where the old version has a slight advantage.
Overall, it will be faster with the new version.
Good to hear! :cheers:
prime95
12-15-2004, 06:57 PM
Originally posted by jjjjL
The FFT routines are slower, yes, but the new IBDWT algo lets the client operate on a half-size FFT.
Actually, the IBDWT only gives Mersenne numbers a half-sized FFT. For k*2^n+1, the larger k is, the larger the FFT size will be. For SoB, most of the k/n pairs you are testing will use the same FFT size as the previous version. IBDWT does give you the mod k*2^n+1 for free, which was a fairly expensive operation in the previous version.
For P3's the new version could be slower. I was expecting the gain from the free mod to about equal the loss from the slower FFTs. YMMV.
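To make the "free mod" point concrete, here is a minimal Python sketch of a base-3 Fermat probable-prime test on k*2^n+1. It is not the client's code: the names proth_candidate and fermat_prp are made up for illustration, and Python's built-in big integers stand in for the FFT multiply. It only shows where the two costs sit in each iteration: a big squaring, then a reduction mod k*2^n+1. Per the post above, the old client paid for that reduction as a separate step, while the IBDWT folds it into the transform.

```python
def proth_candidate(k: int, n: int) -> int:
    """Return N = k*2^n + 1 (hypothetical helper name)."""
    return k * (1 << n) + 1


def fermat_prp(k: int, n: int, base: int = 3) -> bool:
    """Base-`base` Fermat probable-prime test on N = k*2^n + 1.

    Left-to-right binary exponentiation computing base^(N-1) mod N.
    The squaring and the reduction are kept as two separate statements
    on purpose, to show the step that an IBDWT-style transform absorbs.
    """
    N = proth_candidate(k, n)
    x = 1
    for bit in bin(N - 1)[2:]:
        square = x * x        # big multiply (FFT-based in a real client)
        x = square % N        # separate reduction mod k*2^n+1
        if bit == "1":
            x = (x * base) % N
    return x == 1
```

For example, fermat_prp(3, 2) tests N = 13 and returns True, while fermat_prp(3, 3) tests N = 25 and returns False.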
Keroberts1
12-15-2004, 07:57 PM
Is the slower FFT speed because they haven't been optimized as much as the old code? Does this mean that the speed could be increased even more?
prime95
12-15-2004, 09:38 PM
The IBDWT changes required a near complete rewrite of the x87 FFTs. No small task. The old FFT code used a really funky memory layout that the Athlon architecture did not like. Also, using a new simpler memory model made it much easier to do the rewrite.
To make a long story short, yes one could write another set of x87 FFTs using the old funky memory model and gain another 30%. However, the P3 architecture is dead. It just isn't worth the time. Sorry.
Keroberts1
12-15-2004, 09:49 PM
Oh, I don't care if it'd only help the P3s; I just meant anything that would improve the overall effect. I thought they were saying the FFT was slower on everything. Everything I have is an Athlon. I'm only interested in whether there are any more possible improvements for them. P3s shouldn't be PRPing anyway; a test takes so long on them that it'd just be better (in my opinion) for them to be sieving.