Paul, have you considered a "multi-path" version, with CPU-specific paths? Or is it not worth the effort?
Have tried alpha=2.000, 2.500, 3.000 and 2.500 is still the fastest.
Moo that's normal, see my previous post.
OK, first things first: your CPU is more important than your operating system. Also, rather than just saying that it is x faster, it is important to know that it is x faster than y - 4000 better than 2000 is more important than 4000 better than 400000.
I do my best to improve performance, but you must note that while algorithmic improvements should see things improve across all processors, there will be some tweaks that are processor specific. Since I did the development yesterday on an Athlon, Athlon users will get the most benefit from it.
The Alpha may well need changing, and I recommend that you experiment to find the best value for your system.
Oh, and I did some more work last night... there might be a 1.31 soon, but I will compare it with 1.30 on a PIII first.
Regards,
Paul.
Absolutely brilliant increase.
My XP 1800+ went from 178kp/s to 240kp/s
So an amazing 35% increase.
Well done!
EDIT: Actually after running the worker thread at normal and leaving the machine alone it got up to 249kp/s [alpha 3.5] making it a 40% increase!
Last edited by Xeltrix; 05-11-2003 at 05:39 AM.
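For anyone comparing figures in this thread: the percentages are relative to the old rate, which is why the baseline matters as much as the difference. A minimal C sketch of that arithmetic follows. The function name speedup_percent is made up for illustration and is not part of SobSieve.

```c
#include <assert.h>

/* Relative improvement of new_rate over old_rate, in percent.
   Both rates must be in the same unit (e.g. kp/s). */
double speedup_percent(double old_rate, double new_rate)
{
    assert(old_rate > 0.0);
    return (new_rate - old_rate) / old_rate * 100.0;
}
```

With the figures above, 178 to 240 kp/s works out to about 34.8%, and 178 to 249 kp/s to about 39.9%.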
Excellent work Paul. For P3-850 best alpha seems to be about 2.0.
Any chance of an updated console version?
Thanks,
Mike.
Weird. An alpha value of 3-3.2 still works fastest on my P-III (800 MHz)
Oh yeah, the improvement for my PC (4000 p/sec) is from 93500 p/sec to 97500 p/sec.
BTW, does the p range you're testing have any effect on the optimal alpha values?
Hi all,
OK, in answer to some questions, I am trying to keep to a single codebase for all processors. If an algorithm is fast on a PIII it ought to be fast on an Athlon. Fortunately I have got access to both so I can verify it.
With that in mind, yesterday on the Athlon I produced 4 different versions that all ran at about the same speed, and tried them out today on this PIII. The results are:
1.30 release 42.9 kp/sec
1.30 release with smaller cache requirement 47.3 kp/sec
1.31 48.5 kp/sec
1.31 with smaller cache requirement 53.4 kp/sec
So I will now release 1.31 with the smaller cache requirement. Athlon users might see a slight improvement; other users will definitely (I hope) see an improvement.
This also includes the console version.
Regards,
Paul
Originally posted by Moo_the_cow
"BTW, does the p range you're testing have any effect on the optimal alpha values?"
I would have thought not. The important factors in deciding alpha are the range of n, and how many k values are being sieved together.
By the way, this new release may well need some experimentation to find the best alpha value.
Regards,
Paul
Paul,
Great job again. But there seems to be something wrong with the console version. On this PC (XP 2100+), I see the following
SobSieve 1.31 ~280Kp/s
SobSieveConsole 1.31 ~70Kp/s
Thanks,
Mike.
I can confirm the same problem on my XP 1800+
GUI: ~250Kp/s
Console: ~59Kp/s
Console also takes an age to start up. 3-5 secs.
I have managed to get some more speed out of it... on this Athlon the rate goes from 258 kp/sec to 300 kp/sec, which isn't too bad.
Mike/Xeltrix, the console rate problem was just a reporting problem - it worked out the rate incorrectly (I report the rate a quarter of the time now... but I didn't multiply the relevant variable by 4 in the console code). That has been fixed as well.
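One way to read the fix described above, as a hedged C sketch: if the reporting code only samples progress on every fourth update, the sampled progress has to be scaled back up before dividing by the elapsed time. The function and parameter names here (rate_kp_per_sec, sampled_delta_p, report_divisor) are assumptions for illustration, not SobSieve's actual variables.

```c
#include <assert.h>

/* Hypothetical rate calculation.
   sampled_delta_p: progress in p seen by the reporting code,
   seconds:         elapsed time for that progress,
   report_divisor:  the reporting code sees only 1/report_divisor of the work. */
double rate_kp_per_sec(double sampled_delta_p, double seconds, int report_divisor)
{
    assert(seconds > 0.0 && report_divisor >= 1);
    return sampled_delta_p * report_divisor / seconds / 1000.0;
}
```

Leaving report_divisor at 1 when it should be 4 would show a quarter of the true rate, which is roughly the gap between the ~280 Kp/s GUI figure and the ~70 Kp/s console figure reported earlier.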
Be careful not to surpass the speed limit!
Damn, these new versions are blazing fast! 161kp/sec for a 900 MHz Duron? It was "only" a bit more than half that much with 1.28 - which was the current version 3(!) days ago!
Maybe you should do as Mr. "Scotty" Scott does: lower expectations and then - Wham! - surprise everyone. Hm, maybe you already do...
One strange thing:
I got 19 factors within a range of 2.1G around 14T - and all those factors are indeed valid! Did you make sure that those new versions do not find any additional (but nevertheless valid) factors compared to the prior versions? If yes, it may be pure coincidence...
Ah, before I forget it:
FANTASTIC WORK, PAUL!
Paul I love you!!! Haven't done extensive testing (other apps running, only tested in Windows 2000), but I got a ~50000 p/s increase (from 165 to 215 kp/s) versus 1.30! The only problem I see is that it takes 190 Mb of memory, but it's worth it.
The console version crashes on my computer.
Paul,
I have some concerns about SobSieve 1.31. It is producing way too many factors. On my test PC, where the sob.dat file has not changed (still the alternate 300K-20M file), in an 8G range 20.200T-20.208T it found 35 factors. This is much too high, in this region even with a "lucky" 8G I would expect no more than (say) 15 factors.
I am able to parse the results with my Sieve Scoring program. Of the 35,
13 are unique (high, but OK - must be correct)
1 is duplicate (about what I'd expected)
21 are excluded (not good!)
Of course when submitted using the web form, all seems OK - "35 of 35 verified ... 35 new...".
However the 21 "excluded" results simply aren't present in the sob.dat file, so shouldn't be being generated.
On the upside - the speed increase is genuine, and I'm guessing that no genuine factors are being missed.
Cheers,
Mike.
Originally posted by Mystwalker
"Be careful not to surpass the speed limit!
I got 19 factors within a range of 2.1G around 14T - and all those factors are indeed valid! Did you make sure that those new versions do not find any additional (but nevertheless valid) factors compared to the prior versions? If yes, it may be pure coincidence..."
That is odd... you see, for my testing I remove one particular test so that it will report divisors that are not in the bitmap. That way, I get a lot more results reported. All of the results are valid, of course, it is just that they have already been removed. But I checked and double checked and the version on the web doesn't do that... so perhaps it is coincidence. Ah, was that 1.31 (I have just seen Mike's comment)? Maybe I didn't reinstate the test in 1.31... but use 1.32 anyway, it is faster.
EDIT: I just tested 1.31, and it definitely reports all divisors regardless of whether n is in the bitmap or not.
Incidentally, I had a real nightmare with the software just before releasing it. I was trying it out with some test numbers and it failed to find a divisor for one of the numbers. It took me a long time to see that at one point in the code I had used the register "edx" rather than the register "edi"...
By the way, I forgot to mention that this version is a heavy RAM user. 174 Mb... let me know if that is too much, and I can add an optional flag to the software to make it use less.
Regards,
Paul.
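The test Paul mentions (only report a divisor when its n is still in the bitmap) might look something like the C sketch below. The NBitmap layout, one bit per n starting at n_min, and the names n_in_bitmap and should_report are assumptions for illustration. The real SobSieve data structures are not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bitmap: bit i is set if n = n_min + i is still a candidate. */
typedef struct {
    const uint8_t *bits;
    uint32_t n_min;
    uint32_t count;   /* number of n values covered */
} NBitmap;

/* Returns 1 if n is covered by the bitmap and its bit is set, else 0. */
int n_in_bitmap(const NBitmap *bm, uint32_t n)
{
    assert(bm != NULL && bm->bits != NULL);
    if (n < bm->n_min || n - bm->n_min >= bm->count)
        return 0;
    uint32_t i = n - bm->n_min;
    return (bm->bits[i >> 3] >> (i & 7)) & 1;
}

/* With check_bitmap == 0 every divisor is reported (the 1.31 behaviour
   described above); with check_bitmap != 0 only remaining n are reported. */
int should_report(const NBitmap *bm, uint32_t n, int check_bitmap)
{
    return !check_bitmap || n_in_bitmap(bm, n);
}
```

Dropping the check does not make any reported divisor wrong. It only means divisors are also reported for n values that were already removed.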
I think that option would be useful.
I was surprised to see such a high Mem Usage for a fresh system. But strangely, the usage doesn't show up in the Task Manager...
"I was surprised to see such a high Mem Usage for a fresh system. But strangely, the usage doesn't show up in the Task Manager..."
It does if (in Task Manager) you go to 'view' -> 'select columns...', then check 'Virtual Memory Size'.
I agree it would be a nice option. I'm not sure all my sieving PCs have enough virtual memory setup right now. On the other hand, if less memory means less speed, I'd be better off just increasing the VM size.
BTW Fantastic job Paul.
And preliminary news on v1.32, it seems to be producing less factors, so the problem with 1.31 seems to be resolved. We'll see from the sieve scores over the next few days who has been using 1.31
"It does if (in Task Manager) you go to 'view' -> 'select columns...', then check 'Virtual Memory Size'."
193 MByte... neat...
ATM I'm running the sieve on two systems sporting 512 MB RAM - but tomorrow two 256 MB machines will run again...
I've got Task Manager showing 1700K for V1.32 and 2200K for V1.31!
Joe O
"I've got Task Manager showing 1700K for V1.32 and 2200K for V1.31!"
Mine says 3.5 MB. But when you add the column "Virtual Memory Size", you see the difference...
I don't get a chance to report on anything! All very quick. (Siever and responses!)
So now 1.32 is 100Kp/s faster than the siever I was using 4 days ago! Mindblowing.
Also noticed 1.31 was suddenly giving me an influx of factors. It is gone now...
So all in all an amazing job Paul!!!
Now to go reserve (what used to be) insanely large amounts of p for me to sieve!
wow, i don't watch the forums for 12 hours and Paul has already released two new sievers. nice work.
I didn't try ver 1.31 since it ignores the bitmaps but here is what I see speedwise for v1.32.
I installed it on a t-bird athlon 1.4Ghz(100MHz fsb) and it went up from 220k to 245k. Changing alpha from 2.0 --> 3.0 increased it to 255k. This was from v1.30 to 1.32. The console version appears to give almost no increase in speed... nothing I can notice.
I am also very impressed w/ its performance on a Duron 600 I installed it on.
ver rate alpha
v1.28 56k a=2.5
v1.30 80k a=2.5
v1.32 95k a=2.5
v1.32 100k a=3.5
So the smaller cached Duron seems to enjoy a big increase. Makes me wonder how a P4 might do now.
I saw someone mentioning that the sieve scores might get skewed because of v1.31 since it will report extra factors. I'll look into it. Even though the verifier accepts them, that doesn't mean it saves the factor in the db. I'm pretty sure it checks to make sure there's a proth number in the database first and I only added those for numbers that didn't have factors below 1G. So while it's possible a few might get through, it's also likely that the database wouldn't even save them. I'll go through the code and let you know. Mike, for now you may want to add a check to the factor stats to see if the factors are for numbers in the SoB.dat. You may want to discount those that aren't. I can find you a copy of the original dat file if you want it or you can exclude the check for numbers below p=210G which is about where the current dat file was sieved to when it was created. I think a few numbers above it were also done so I'd just exclude all p < 1T from such a check for now.
-Louie
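The check Louie suggests could be sketched in C roughly as follows, using the unique / duplicate / excluded labels from Mike's earlier post. Everything here is hypothetical: classify_factor, the two flags, and the idea that the caller has already looked the (k, n) pair up in SoB.dat are assumptions, not how the scoring program is actually written.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { FACTOR_UNIQUE, FACTOR_DUPLICATE, FACTOR_EXCLUDED } FactorClass;

/* p:                the reported factor
   in_dat:           nonzero if the (k, n) pair is present in SoB.dat
   already_factored: nonzero if a factor for this (k, n) was already known */
FactorClass classify_factor(uint64_t p, int in_dat, int already_factored)
{
    /* Below 1T the SoB.dat check is skipped, as suggested above. */
    const uint64_t check_from = UINT64_C(1000000000000);
    assert(p > 1);
    if (p >= check_from && !in_dat)
        return FACTOR_EXCLUDED;
    if (already_factored)
        return FACTOR_DUPLICATE;
    return FACTOR_UNIQUE;
}
```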
Originally posted by Joe O
"I've got Task Manager showing 1700K for V1.32 and 2200K for V1.31!
Joe O"
OK, so that is the "working set". The total virtual memory is 34228K for V1.31 and 56612 for V1.32.
Originally posted by MikeH
"It does if (in Task Manager) you go to 'view' -> 'select columns...', then check 'Virtual Memory Size'.
I agree it would be a nice option. I'm not sure all my sieving PCs have enough virtual memory setup right now. On the other hand, if less memory means less speed, I'd be better off just increasing the VM size."
OK, I had a brainwave tonight and I worked out how to get rid of loads of the extra memory that it is using. This probably won't make it any faster, but it will make it a better friend to the other programs that you are running.
"BTW Fantastic job Paul."
Hey, I am just reacting to the great ideas that people are having on this forum. There was a really interesting conversation with mklasson, Joe O, MikeH, wblipp, and others, about using higher powers of t a few weeks ago. And basically I am implementing ideas that came out of that along with trying to reuse calculated values as much as possible.
"And preliminary news on v1.32, it seems to be producing less factors, so the problem with 1.31 seems to be resolved. We'll see from the sieve scores over the next few days who has been using 1.31"
*chuckle* Please move to 1.32 - it is faster!
Regards,
Paul.
"I'll go through the code and let you know. Mike, for now you may want to add a check to the factor stats to see if the factors are for numbers in the SoB.dat. You may want to discount those that aren't."
It's no problem, I already use my own 1G sieved sob.dat file, so I can catch anything that would have fallen out before 1G. Anything that would have fallen out between 1G and the current sieving ranges will just be classified as yet more duplicates.
Prior to today, only Nuri and myself had submitted "excluded" factors, and that was due to me not constructing the first alternate sob.dat file correctly. My guess is there will be another half dozen with excludeds once today's stats are ready.
Hi all,
Here is a 1.33 release. This uses a lot less memory than 1.32, but is just as fast - in fact, it might be a tad faster (but the difference is negligible).
Regards,
Paul.
Again, many thanks Paul.
Preliminary findings:
On AMD XP2100+ 1.32 is faster than 1.33 by ~5%.
On P3-850 1.33 is faster than 1.32 by ~5%.
Could be memory related - AMD has 768M, P3 has 256M.
"Pick and mix" sounds good to me.
Mike.
EDIT: Just to clarify, these numbers refer to the console version.
Last edited by MikeH; 05-13-2003 at 09:11 AM.
Wow, very nice.
PII-350 alpha=3.5 (alternate sob.dat) 1.30 36k 1.33 48k
Watch out 30000 here we come.
great job Paul
1.28 203kp/s
1.33 254kp/s
on a P3-Celeron (Tualatin) with 1500 MHz, 128 MB SDRAM, alpha=2.5
1.32 didn't work (system crashed after some minutes)
Thanks Paul! 1.33 console is a bit faster than 1.32 GUI (~225000 p/s), while 1.33 GUI is as fast (or as slow) as 1.32 console (~210000 p/s). That's with 300 k - 20 M SoB.dat and Windows 2000.
Priwo, what speed does that Tualatin get on PRP testing? Here's my problem. Could you reply in that topic, please? Thanks!
ok, this is great, but is there any chance for a Linux version?
We non-windows folks are not feeling the love.
Should I bother to see if I could run this under Wine?
Tualatin-Celeron 1500 doing prp-test:
n=3673052 k=21181
cEMs/sec=155.000 time left (for full test)=91hours 5minutes
While V1.33 gives me the best results for n < 20M for p around 17.3T, V1.28 gives me the best results for n < 3M for p around 3.3T.
Celeron/450, Win NT 4, p range around 3.3T (lower range sieving, i.e. n < 3M):
113232 p/sec V1.28
108353 p/sec V1.30
92915 p/sec V1.32
90278 p/sec V1.33
PIII/500, Win98SE, p range around 17.3T (full range sieving, i.e. n < 20M):
68157 p/sec V1.33
67575 p/sec V1.32
47477 p/sec V1.30
46458 p/sec V1.28
Joe O
Originally posted by OberonBob
"ok, this is great, but is there any chance for a Linux version?
We non-windows folks are not feeling the love.
Should I bother to see if I could run this under Wine?"
Yes, definitely!
I have been thinking about a Linux version, but haven't had the time to get it done properly. If there are any volunteers out there who have got time and know gcc and embedded assembler (like this:
int x (int y)
{
    unsigned halftoMinNum;
    asm("shr $1,%edx ");
    asm("rcr $1,%eax ");
    asm("movl %eax,halftoMinNum");
}
)
then let me know. The pain is that the assembler all needs to be changed round - the VC++ version of the above is
int x (int y)
{
    unsigned halftoMinNum;
    _asm
    {
        shr edx, 1
        rcr eax, 1
        mov dword ptr halftoMinNum, eax
    }
}
Regards,
Paul.
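A note for any volunteer: in GCC a C local variable generally cannot be named directly inside the asm string, so the usual route is extended asm with operand constraints. Below is a hedged sketch of the same shr/rcr pair written that way. The function name half_low32 and the choice to pass the two halves in as parameters are assumptions for illustration, and this is not SobSieve's code. On non-x86 targets the sketch falls back to plain C so that it still compiles.

```c
#include <stdint.h>

/* Shift the 64-bit value hi:lo right by one bit and return the low 32 bits,
   i.e. what the shr/rcr pair leaves in the low register. */
uint32_t half_low32(uint32_t hi, uint32_t lo)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__("shrl $1, %0\n\t"   /* hi >>= 1; old bit 0 of hi goes to carry */
            "rcrl $1, %1"       /* rotate lo right through carry */
            : "+r"(hi), "+r"(lo)
            :
            : "cc");
    return lo;
#else
    /* Portable equivalent for other compilers and CPUs. */
    return (lo >> 1) | (hi << 31);
#endif
}
```

The constraints let the compiler pick the registers, so the code no longer depends on values already sitting in edx and eax.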
Originally posted by paul.jobling
"The pain is that the assembler all needs to be changed round"
Perhaps an automated intel->at&t converter would be least painful?
A quick search yields:
http://www.delorie.com/djgpp/mail-ar...06/06/05:48:34
Might just work. It seems like you'll have to write a filter that converts "_asm{" to at&t as well, makes a string of the instructions, and so on, but that should be relatively easy. Of course, a more thorough googling might come up with a better tool.
Regards,
Mikael
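To give a feel for what such a filter has to do, here is a toy C sketch that handles only the simplest case: a two-operand line whose destination is a register and whose source is a register or a decimal immediate. The name intel_to_att is made up, and this is not the tool from the link above. Memory operands such as "dword ptr ..." are rejected rather than converted.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Convert "op dst, src" (Intel order) to "op src,dst" (AT&T order).
   Returns 0 on success, -1 if the line is not in the simple form. */
int intel_to_att(const char *intel, char *att, size_t att_size)
{
    char op[16], dst[32], src[32];
    assert(intel != NULL && att != NULL && att_size > 0);
    if (sscanf(intel, " %15s %31[^, ] , %31s", op, dst, src) != 3)
        return -1;
    /* AT&T puts the source first; immediates get '$', registers get '%'. */
    const char *src_prefix = isdigit((unsigned char)src[0]) ? "$" : "%";
    int len = snprintf(att, att_size, "%s %s%s,%%%s", op, src_prefix, src, dst);
    return (len > 0 && (size_t)len < att_size) ? 0 : -1;
}
```

Feeding it "shr edx, 1" gives "shr $1,%edx", matching the gcc snippet earlier in the thread. A real converter would also need to handle memory operands, size suffixes, and hex constants.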
Paul,
A few quick questions. When I'm using SobSieveConsole, how do I set the alpha? When I use SobSieve and change the alpha, the value is remembered. Is this stored value then used by SobSieveConsole?
I have taken a quick look through the registry, and unless the alpha is stored as part of 'FriendlyCacheCTime', then I can't see anything obvious.
What would be nice is to have the ability to set the alpha as a command line parameter (or config file) for SobSieveConsole.
Thanks,
Mike.
Originally posted by MikeH
"Paul,
A few quick questions. When I'm using SobSieveConsole, how do I set the alpha? When I use SobSieve and change the alpha, the value is remembered. Is this stored value then used by SobSieveConsole?
I have taken a quick look through the registry, and unless the alpha is stored as part of 'FriendlyCacheCTime', then I can't see anything obvious.
What would be nice is to have the ability to set the alpha as a command line parameter (or config file) for SobSieveConsole.
Thanks,
Mike."
Mike,
use
-a=<value> on the command line, for example
SoBSieveConsole -a=3.0
You can also use
-o=<file name>
to tell it the name of the status file to use. This is useful on multiprocessing systems or clusters to enable one SoB.dat file to be shared by many instances of the program.
Mikael,
Thanks for that link - it looks good. Some small modifications are required, but not too much.
Regards,
Paul.
Many thanks Paul. Very useful.
From Mike's stats page it appears that SoBSieve v.1.33 is producing extra factors.
On Mike's stats, I have 1 excluded factor, and I didn't use versions 1.31 or 1.32.
Paul (or anyone else), do you know what is the reason for this?
"On Mike's stats, I have 1 excluded factor"
The excluded factor (in results.txt format) is
12620584122611 22699 3808774 2537 NULL
... and for anyone that's interested, here's a full list of the factors that are spat out by the scoring as excluded.
Last edited by MikeH; 05-16-2003 at 09:16 AM.
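For anyone who wants to double-check a line like the one above by hand: a factor p is valid if it divides k*2^n + 1. The C sketch below assumes the first three fields of a results.txt line are p, k and n in that order. The meaning of the remaining fields is not stated in this thread and they are ignored. Function names are made up for illustration, and no result is claimed here for the line quoted above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* (a * b) mod m without overflow for m < 2^63, by double-and-add. */
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t m)
{
    uint64_t r = 0;
    a %= m;
    while (b) {
        if (b & 1) { r += a; if (r >= m) r -= m; }
        a += a; if (a >= m) a -= m;
        b >>= 1;
    }
    return r;
}

/* base^e mod m by square-and-multiply. */
static uint64_t powmod(uint64_t base, uint64_t e, uint64_t m)
{
    uint64_t r = 1 % m;
    base %= m;
    while (e) {
        if (e & 1) r = mulmod(r, base, m);
        base = mulmod(base, base, m);
        e >>= 1;
    }
    return r;
}

/* Returns 1 if p divides k*2^n + 1, else 0. */
int divides_proth(uint64_t p, uint64_t k, uint64_t n)
{
    assert(p > 1 && p < (UINT64_C(1) << 63));
    uint64_t t = mulmod(k, powmod(2, n, p), p);
    return (t + 1) % p == 0;
}

/* Parses "p k n ..." and returns 1 (valid), 0 (not a factor) or -1 (bad line). */
int check_result_line(const char *line)
{
    unsigned long long p, k, n;
    if (sscanf(line, "%llu %llu %llu", &p, &k, &n) != 3 || p < 2)
        return -1;
    if (p >= (1ULL << 63))
        return -1;
    return divides_proth(p, k, n);
}
```

Note that this only tells you whether the factor is mathematically valid, not whether the (k, n) pair is still in sob.dat, which is what "excluded" is about.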
Is it possible for one value to eliminate several different N values, or maybe N values for different Ks? I don't need the technical info on this, just a yes or no, but also, if the answer is yes, does the sieve program check for this?
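On the purely mathematical side the answer is yes: a prime p can divide k*2^n + 1 for more than one n, and for more than one k. Whether SoBSieve reports every such hit is not stated in this thread. Here is a small brute-force C sketch that counts the hits for one p and one k over a range of n. The name count_hits and the range limit are assumptions for illustration, and it is only meant for small toy ranges.

```c
#include <assert.h>
#include <stdint.h>

/* Counts n in [n_min, n_max] for which p divides k*2^n + 1. */
unsigned count_hits(uint64_t p, uint64_t k, unsigned n_min, unsigned n_max)
{
    assert(p > 1 && p < (UINT64_C(1) << 62));
    assert(n_min <= n_max && n_max < 1000000u);
    uint64_t t = k % p;              /* t = k * 2^n mod p, starting at n = 0 */
    unsigned hits = 0;
    for (unsigned n = 0; n <= n_max; n++) {
        if (n >= n_min && (t + 1) % p == 0)
            hits++;
        t = (t * 2) % p;             /* move on to n + 1 */
    }
    return hits;
}
```

As a tiny example, p = 7 divides 3*2^n + 1 for n = 1, 4, 7 and 10, and it also divides 5*2^n + 1 for n = 2, 5 and 8, so one p can knock out several n values across different k.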