
View Full Version : Sieve Client Thread




paul.jobling
01-23-2003, 12:40 PM
Well, all this competition has proven productive ;) I have produced a version that is 45% faster than the 1.10 version at p=150G, though you may get different results based on the platform and the p values that you are using. However, I have tried to be as friendly as possible to different processors.

At the end of the day, myself and Phil both know that 99% of the time in the code is being spent doing a couple of things, and you can only optimise those things so far. I would expect that in the limit, the only difference between our software on the same platform would be the OS overhead. Which is probably why his Linux version is faster than the Windows version!

Regards,

Paul.

FatPhil
01-23-2003, 01:33 PM
Originally posted by paul.jobling
Well, all this competition has proven productive ;) I have produced a version that is 45% faster than the 1.10 version at p=150G, though you may get different results based on the platform and the p values that you are using. However, I have tried to be as friendly as possible to different processors.

At the end of the day, myself and Phil both know that 99% of the time in the code is being spent doing a couple of things, and you can only optimise those things so far. I would expect that in the limit, the only difference between our software on the same platform would be the OS overhead. Which is probably why his Linux version is faster than the Windows version!

Regards,

Paul.

I think cooperation works best when both brains are truly primed for tackling the problem, cobwebs dusted off and all that. I think we're both at ramming speed now, aren't we? The amusing thing is that I know I can make my latest version much faster, but a quick attempt failed curiously, and I wanted to draw a line under it tonight for Sander.

On the website is version 009, which is over times the speed of 006 (my initial release). That could potentially make it 5 or more times the speed of the old SoBSieve that people were using only a few days ago. My Duron 900 does 10M in 325 seconds, or >30,750 p/s.

Solaris, unfortunately, failed to take to the latest improvement, and it got horribly slower. I might make a 008+ or a 009- for Solaris, which would have only 1 of the 2 optimisations that I put in between 008 and 009, though.

Don't worry folks, this "competition" isn't a damaging one at all, I simply come up with my best ideas when under pressure, and I want to maintain that pressure just for another day or two. I have one more path I _must_ follow, and I'm just a bit focussed on it at the moment. To misquote Homer Simpson - "can't talk, coding".

Of course my code, when it reaches its final state in C would benefit from some asm at the important places, but I'm not sure I'm up for the job. Paul, however, is a pretty nifty asm coder, so the conclusions are obvious.

(I remember when we were in that pub with the biltong in Reading you tried to tell me that I was an alright asm programmer, Paul, and I denied it. Well here's the proof - I can't asm-ify this for toffee, even though it shouldn't be hard at all!)

So, Windows and Linux dudes, grab yourself a 009, and report back, please :-)

Phil

Alien88
01-23-2003, 01:45 PM
Remember, please use this thread for discussion of the client. Use the other thread to coordinate your sieving.

RangerX
01-23-2003, 02:00 PM
OK, so with SoBSieve 1.11 I should NOT change the alpha from 1? Or was that 1.10 specific? >.<

ceselb
01-23-2003, 02:19 PM
Originally posted by RangerX
OK, so with SoBSieve 1.11 I should NOT change the alpha from 1? Or was that 1.10 specific? >.<

You could always change it if you wanted. The reminder was to get people to not continue on 3.2. 1 should be near optimal right now afaik.

Btw, my PIV isn't getting the stated 45% increase, but instead around 6%. Not that I'm complaining or anything. :cheers:

smh
01-23-2003, 03:06 PM
The amusing thing is that I know I can make my latest version much faster, but a quick attempt failed curiously, and I wanted to draw a line under it tonight for Sander.

Phil, please don't let me stop you, it's just an old P3 that is going to run until someone reboots it (probably somewhere next week), so it's of only small use to the project anyway. (And I'll have access to it until tomorrow evening ;-))

Actually I prefer SoBSieve. I can set that one up as a Windows service. (The PC automatically switches on every evening in case someone turns it off).

I'll do some testing tomorrow.

FatPhil
01-23-2003, 03:14 PM
Originally posted by smh
Phil, please don't let me stop you
...


I just think it would be stupid for me to spend a day trying to make this optimisation again, and to fail again, so I'm making sure that that with which I am happy is available.

009 is between one and a half and twice as fast as 008 on my Durons, I thought it was worth getting it out there sooner rather than later.

Phil

kmd
01-23-2003, 05:35 PM
Not that it particularly matters, but the newest version of SoBSieve is giving me much larger numbers for sec/n/k. Before I was getting around 7 or 8 when I wasn't doing other stuff, and now it's up in the 1400's. Why the big change?

Also, SoBSieve is set up to automatically highlight all factors in the window every time you re-open the window. Is there some way this could be disabled?

Mystwalker
01-23-2003, 05:57 PM
Phil:
Got a 75% performance increase with my P3-m and even a bit more with the Duron. Well Done! :cheers:

One suggestion:
Could you make the batch file customizable? Perhaps with a parameter that's printed in front of the program call.
Then one could use a "start /low NbeGon[...]" command...

ceselb
01-23-2003, 07:05 PM
I did a little timed run to compare the speed of the sievers. Range 25 - 25.01G.

SoBSieve 1.11 12m03s
NbeGon_009 7m30s

Conclusion: PIV should use NbeGon!

Mystwalker
01-23-2003, 07:11 PM
My suggestion above is more or less obsolete!
I just created another batch file containing "start /low sob.bat"...

But maybe you could add a key (combination) to end the program after having updated the start point in the batch file?
So no work is lost when one stops the program.

ceselb
01-23-2003, 08:09 PM
Did another test, on an old PII this time. Range 25 - 25.005G.

SoBSieve 1.11 14m09s
NbeGon_009 7m39s

Conclusion: PII (and probably P3) should use NbeGon!

jjjjL
01-23-2003, 08:27 PM
My Athlon 1.33GHz:

SoBSieve 1.11 (a=1.0) --> 25,000 p/sec
NBeGon 009 (d=2.9) --> 55,000 p/sec (!!)

Wow. My results may not be universal because my range is low but those speeds are amazing. Good job Paul and Phil! I don't think people (even those who have seen all the recent improvements in the public sieves) really grasp just how amazing these speedups are!

Just to give everyone a little perspective, let's look at the previous speed king: NewPGen 2.70. Don't get me wrong, it's a GREAT program and much faster than other sieves, but the ability to sieve all 12 k's at once is a real big deal. I just benchmarked p=~14 billion for k=27653 from n=3-20million. This runs at 11,500 p/sec. Now, correct for the fact that you have to sieve each k separately and you're down to an average total speed of 960 p/sec. That doesn't even take into account the fact that most of the sieving was done for SB with an out-of-date version of NewPGen (v2.40 I think) that ran about 40% slower. This means the previous effective rate was (assuming I had used this computer):

580 p/sec. :eek:

I did eventually upgrade a few months ago but still... to consider that we are now capable of running at 100x the previous speed is insane. My workstation is now capable of doing in a day what I normally did on the U of M supercomputing cluster in a week! :)

Now add to the 100x speedup the fact that it is a public search and we have effectively 500-1000x the previous sieving power. Sit back and think about that for a minute and just try and tell me that you don't love Paul and Phil for all their hard work. :D

-Louie

dmbrubac
01-23-2003, 09:01 PM
... we have effectively 500-1000x the previous sieving power.

:|ot|:

Please don't hurt me Mr. Moderator, but...

Is there similar optimization room in the SoB client? I won't pretend to understand the math, but it just stands to reason.

Instead of my current production sitting at about 300, it could be at 30,000! Instead of the entire production sitting at about 90 M, it would be 9 G.

Also considering that these enhancements were done by two people over just a few days (?) I believe it is worth the effort.

Alien88
01-23-2003, 09:18 PM
Originally posted by dmbrubac
:|ot|:

Please don't hurt me Mr. Moderator, but...

Is there similar optimization room in the SoB client? I won't pretend to understand the math, but it just stands to reason.

Instead of my current production sitting at about 300, it could be at 30,000! Instead of the entire production sitting at about 90 M, it would be 9 G.

Also considering that these enhancements were done by two people over just a few days (?) I believe it is worth the effort.

There already has been a lot of optimization done to the code.. if you only knew how slow the client was when the project first started.

I'm sure there's room for some improvement.. there always is.. but I'll let Louie field the rest of the answer.

Hawk
01-23-2003, 09:39 PM
SoB uses George Woltman's FFT libraries, which are insanely well optimised (it's almost all in very carefully hand-crafted assembler). They have been worked on over a number of years (about 10 now? Maybe more?) for the Mersenne Prime project. Consequently, I don't think the project will see any major increase in speed. As Alien mentioned there already was a huge speed increase when the Woltman libs first started to be used.

Mike Bell.

P.S. Great to see Paul and Phil both working on this, Paul's code was already a few times faster than mine before the last round of optimisations..... :notworthy

jjjjL
01-23-2003, 09:46 PM
there is actually someone working on the asm core of SB. I don't know how public he wants his effort to be or I'd post more, but there is definitely some effort directed to it.

Any asm folks out there who need a happy-fun ;) weekend project are welcome to email me for the code. this is not a good "learning" or "first stab" at asm type project... but if you think you have a good command of x86 asm, drop me a line and I'll hook you up.

-Louie

Alien88
01-23-2003, 09:57 PM
Phil's client (NbeGon): (OpenBSD p3-500)

d=1
version: 008
9805

version: 009

---
Speed testing:

13140 d=1
14445 d=1.5
14991 d=1.75
15283 d=2
15531 d=2.5
15729 d=3
15772 d=3.5
15801 d=4

--
Alien88

frmky
01-23-2003, 10:36 PM
Wow!!! NbeGon_009 runs at 73,000 p/s on my Athlon XP 1800+! And this is pure, portable C code! I can't wait to see how high this goes when further optimized and the core routines are converted to ASM. 'Tis already almost 6x faster than the original SOBSieve. Go Phil and Paul!!

Greg

dmbrubac
01-23-2003, 11:06 PM
Originally posted by Alien88
There already has been a lot of optimization done to the code.. if you only knew how slow the client was when the project first started.

I'm sure there's room for some improvement.. there always is.. but I'll let Louie field the rest of the answer.

all righty then.

jjjjL
01-24-2003, 12:55 AM
Phil,

It would be convenient if symbols could be used in the range settings on NBeGon.

i.e.

./NbeGon_009_x86 -s=SoB.dat -d=2.0 -p=200G-210G &

instead of

./NbeGon_009_x86 -s=SoB.dat -d=2.0 -p=200000000000-210000000000 &

It would help me do automated execution on my cluster. Thanks.

-Louie

*EDIT* - I already found a workaround. I don't need this feature, although I think it might be good for people.

FatPhil
01-24-2003, 06:21 AM
Originally posted by jjjjL
Phil,
./NbeGon_009_x86 -s=SoB.dat -d=2.0 -p=200G-210G &


Good suggestion. v010 will be out tomorrow. It promises to be nearly 50% faster (that is a 3:2 ratio) than v009. (I made one lousy typo when doing the optimisation yesterday).

I'm praying that this means that someone will post a 100,000p/s figure, which I think will make me quite happy.

I'm heading off to HUT's CSC now to compile native versions for every architecture I can find there. So briefly it will run on the 37th most powerful supercomputer in the world!
Then I'm off to a beer festival. Exactly what happens after that no one knows, but I'm sure Paul can guess.

What I'll do for the 24 hours between now and release is run 010 on a vast range that someone else has already tested with SoBSieve, so that everything can be double-checked.
Given that the most factors come out at low P, I'll simply run from 10G upwards, and gather a huge number.

Phil

FatPhil
01-24-2003, 07:11 AM
Originally posted by Alien88
Phil's client (NbeGon): (OpenBSD p3-500)


The first bug report is in! However it is so far only reproducible on Alien88's OpenBSD machine with version 009.

The following I have tested just now and do not have the error spotted:
OpenBSD 008 (on Alien88's machine)
Linux 009 (tested on Duron, and PPro)
Windows 009 (tested on Duron)
Alpha 009
Sparc 009(-FP16)

The difference between 008 and 009 is the replacement of the baby-step integer code with FP equivalent, which means that it _could_ be chip-specific rather than OS/build specific. The OpenBSD code was compiled with gcc 2.95, which is old, and not exactly bug-free, so I hope it's a compiler problem!

Anyway, if someone can check the following range
-p=280000900000-280001000000
for the existence of
280000989923 | 21181*2^15223028+1
on either other OpenBSD systems, or on a P3 using the linux/windows code, I'd be grateful. (a P4 verification would be nice too).

Thanks,
Phil

nuutti
01-24-2003, 07:51 AM
I found that factor using PII500 and win2000 and 009.

Yours,
Nuutti

olaright
01-24-2003, 08:31 AM
Found it on PIII 450, (using version 9)

Cheers,

ola

FatPhil
01-24-2003, 08:41 AM
Originally posted by nuutti
I found that factor using PII500 and win2000 and 009.


Thanks Nuutti - I've just worked out what the problem is.
(Literally 30 seconds ago - so yes, I'm not at HUT, I'm debugging instead, but will be at the beer festival - fancy coming? Gallows Bird, Meritulentie 30, in Niittykumpu, http://www.kolumbus.fi/gallows/ra03festival.html).

It seems that under all x86 OSes apart from OpenBSD the default processor FP precision state is extended precision (80 bits), rather than double precision (64 bits), and I wasn't manually setting the state myself. I have hardcoded some magic numbers that assume 80 bits on x86 and 64 bits on other processors, and therefore OpenBSD was b0rking up my rounding completely. When I've worked out what bits to diddle with in the FPU control word, I should be as right as rain.

Because I'm not at HUT, I'm doing a shorter test 10G-10.1G for 010 (I've finished in 5950s on a 533MHz machine, I'm now waiting for SoBSieve, and we have 90 out of 90 agreement so far :-) ), and I'll get this latest version out later today, _before_ going to the beer festival.

Phil

nuutti
01-24-2003, 09:42 AM
I have to miss but may be some other time during this winter/spring ?
(I will move to Chicago,IL during next summer)

Yours,

Nuutti

paul.jobling
01-24-2003, 09:58 AM
Hi all,

Okay, here is the latest version. This is more cache-friendly, so it should be faster as it spends less time waiting for the processor to get memory into its internal cache; also I have tried to keep the number of instructions to a minimum, which should make it faster. It is certainly faster on this machine.

Also, I have removed the message box on startup. The range is now in the title bar - thanks for the suggestion!

Oh, and the rate now shows the number of seconds between removal on a per-k basis - so if there are 10 k's, and we are removing one n every 2 seconds, then the rate is 20 seconds (on average) to remove an n for each k. This is a better measure.

Regards,

Paul.

dmbrubac
01-24-2003, 11:08 AM
Look at Analyzing Processor Activity (http://www.microsoft.com/windows2000/techinfo/reskit/en-us/default.asp?url=/windows2000/techinfo/reskit/en-us/prork/pred_ana_eznf.asp)
for more info on process and thread priority. The combinations and permutations give a total of 31 priority levels.

Also, before the two threads were split, I responded to Phil about NbeGon and SoB not sharing CPU time well. I found if I ran SoB at Normal and NbeGon at Below Normal, they each received 48-49%.

I know it's not recommended to run SoB at Normal, but too bad.

FatPhil
01-24-2003, 11:15 AM
Originally posted by paul.jobling
Okay, here is the latest version.

Snap.

My 010 has just passed a very lengthy test, and is available from http://fatphil.org/maths/sierpinski/bin/ . One and a half times the speed of 009 on linux and windows, as hoped.

The dimension parameter will now almost certainly want to be higher - I'm using -d=8.0 for optimal results on my machine.

That's all for today. Beer has priority over coding...

Phil

nuutti
01-24-2003, 11:35 AM
Nice job !
I tested range 150,000,000,000 -> 150,001,000,000
and it took 48 sec using your sieve and 150 sec using Paul's
Rates : Phil 21,000 and Paul : 9,100

I have PII 500.


Nuutti

ceselb
01-24-2003, 12:01 PM
If anybody need a test range here's one:
25 - 25.34G. Done by SoBSieve 1.06, so it should be accurate.

frmky
01-24-2003, 12:37 PM
Originally posted by FatPhil
Snap.

My 010 has just passed a very lengthy test, and is available from http://fatphil.org/maths/sierpinski/bin/ . One and a half times the speed of 009 on linux and windows, as hoped.

The dimension parameter will now almost certainly want to be higher - I'm using -d=8.0 for optimal results on my machine.


On my Athlon XP 1800+, the optimal d parameter is -d=4.5, and there it checks 97000 p/s. I'm sure someone with a faster Athlon will report your 100,000 p/s number. :)

Greg

Alien88
01-24-2003, 02:14 PM
Originally posted by frmky
On my Athlon XP 1800+, the optimal d parameter is -d=4.5, and there it checks 97000 p/s. I'm sure someone with a faster Athlon will report your 100,000 p/s number. :)

Greg

I can't believe my P4 2.2G only gets 50,875 p/s..

jjjjL
01-24-2003, 02:34 PM
I don't know if everyone saw the other thread where I posted, but this may help phil/paul for verification



http://www.seventeenorbust.com/sieve/results.txt

results below 100G not shown to stop it from being too long. it will rebuild itself every night at 3:45am EST.




It's meant to be a dirty dump from the dB just to make sure there aren't gaps in the sieve, but you could easily reconstruct it into sieve output for testing.

-Louie

PS - Athlon 1.33 = 88,000 p/sec with NBeGon 0.10 (d=3.5)

smh
01-24-2003, 02:55 PM
Just in time to give my pc a last update before flying to KL tomorrow morning.

My P3 450 MHz gets 19.2K
FSB=100MHz, 128Mb sdram (100MHz)

My P4 2400 MHz gets 55.2K
FSB=533MHz, 256Mb ddrram (166MHz)

It seems all P4's score much lower than expected.

Any ideas why?

MikeH
01-24-2003, 03:11 PM
Originally posted by frmky
On my Athlon XP 1800+, the optimal d parameter is -d=4.5, and there it checks 97000 p/s. I'm sure someone with a faster Athlon will report your 100,000 p/s number.
Phil, we are there!! (frmky thx for the d=4.5)

AMD XP 2100+, Win2K, NbeGon_010 ...... 112Kp/s

SoBSieve 1.12 (alpha = 1.0) ..... 43Kp/s

(both using p=670G)
:cheers:

Mystwalker
01-24-2003, 05:25 PM
My results with NbeGon010:

1 GHz P3-m: ~63kp/sec d=4.00
900 MHz Duron: ~46kp/sec d=7.20

smh:
Did you tweak the alpha values (separately for each system)? As you can see in my example, the optimal value changes a lot between different architectures!

nuutti
01-24-2003, 05:32 PM
It would be easier to test for the correct alpha if the program could calculate the
rate directly. Now a clock, paper and pencil are required.

Yours,

Nuutti

Mystwalker
01-24-2003, 05:36 PM
Well, I just let it run for approx. 1 minute (started at a round value) and look where it stopped (actually, I take another round value somewhere nearby). Then I vary the alpha value and note how long it takes to reach that mark (from the same start, of course) - until I find the alpha with the least time needed to get there. With constant d, the time is quite constant, too, if your computer does no other tasks at that time.


btw.:
What the hell happened to the memory needs of NbeGon?!??!
Up until 009 (incl.), it needed more than 20 MByte. Now it fluctuates between 300 KByte and 3 MByte!

McBryce
01-24-2003, 06:34 PM
Hello,
I use SoBSieve 1.12 and the screen output only goes to 781 factors, then it stops. In SoBStatus.dat all factors are stored, including the 782nd, 783rd and so on. Is this a minor bug? Some people might think they are not computing any more factors.

Sorry for my English, but it's late and I have to go to bed now!

Martin

RangerX
01-24-2003, 07:50 PM
New NbeGon runs at ~64kp/sec on my 1 GHz Athlon Thunderbird (d/alpha=3.2). SoBSieve ran ~20kp/s with the same alpha, and all I did was start NbeGon from where I stopped SoBSieve (actually, to make sure I don't miss anything, I start it from the last factor found instead of where it says it stops).

And also this scared me a lot so I want to make sure it doesn't get anyone else. If a blank line somehow gets into the submission form, it will ring up as an invalid factor. So if you get an invalid check the beginning and end of the submitted factors line for extra commas. Mine started with a comma so that's how I knew that a blank line slipped into the front.

Also, the NbeGon 010 uses 27k memory for me, so I don't know what's going on with yours.

ceselb
01-24-2003, 08:04 PM
Originally posted by RangerX
New NbeGon runs at ~64kp/sec on my 1gHz Athlon Thunderbird (d/alpha=3.2).
Maybe you've already done this, but try a higher d.


Originally posted by frmky
On my Athlon XP 1800+, the optimal d parameter is -d=4.5

FatPhil
01-25-2003, 09:43 AM
Originally posted by frmky
On my Athlon XP 1800+, the optimal d parameter is -d=4.5, and there it checks 97000 p/s. I'm sure someone with a faster Athlon will report your 100,000 p/s number. :)

Greg

I had an interesting brainwave today - I'll see if I can push you over the limit. I'm hoping I can push Louie's 88000 over too. 12% shouldn't be too hard (famous last words). This idea also splits the code into smaller, more manageable units (do one job at a time, and do it well).
Actually I had several brainwaves - one of which was to use the -march=athlon switch to gcc. That made it 20% slower!

Sorry about the P4s - they'll almost certainly need assembly. David Underbakke, the coder behind the sieve for Yves Gallot's GFN project, has some very fast 64-bit modular maths libraries for the P4, and they'll possibly be the cure.

Phil

RangerX
01-25-2003, 09:49 AM
Originally posted by ceselb
Maybe you've already done this, but try a higher d.


When I ran SoBSieve (where I could see a visible rate) I played around with the values. 3.1 gave me somewhere around the 17000's. Same with 3.3. 3.2 gave me 19000's. I just assumed that the equation only had one peak and that was it. Is it possible to have multiple peaks so that higher alphas might yield something higher?

Also phil, is it advisable to restart NbeGon every once in a while to reset the overflow numbers? Right now mine's at 13576|3|1|0.

Mystwalker
01-25-2003, 10:00 AM
AFAI get it, an overflow only has an impact on the very prime it tries. So the count is basically irrelevant for the speed and has tweaking purposes only.
Phil should know best, though. :)

Otherwise, there's no downside in stopping the client. If you want, you can copy the value of the last p and insert it into the batch file to keep the seconds to few minutes of progress since the last checkpoint, too.

dudlio
01-25-2003, 04:36 PM
3.2 was indeed a good alpha for SoBSieve, but it's on the low side for NbeGon. I just did a bunch of tests and found that the whole range of d=3-6 is pretty similar, but d=4-5 is the sweet spot. This is on an Athlon, and it's independent of the size of p.

Another thing: sieving goes faster as p gets bigger (!)

jjjjL
01-25-2003, 07:49 PM
I don't know if it will improve people's speeds much, but a new sieve file is online bundled with the newest SoBSieve (v1.12).

It has all numbers currently submitted removed... so all p < 25 billion (my initial range) should be gone along with anything else people had submitted up to about 10 minutes ago.

The link is updated on the sieve page but here's a direct link to the new stuff:

http://www-personal.engin.umich.edu/~lhelm/sieve112.zip

This takes the sieve file from ~815000 n values to 746961 values. This is a linear reduction of 8.3%. However, I believe the speed of the sieve is related to the sqrt(n) so that reduction is only:

903 --> 864

Giving a reduction of around 4.3%. I haven't done benchmarking, so let me know if you get the theoretical ~5% speedup from using the new sieve file.

Happy crunching! :)

-Louie

FatPhil
01-26-2003, 12:42 AM
Originally posted by jjjjL
This takes the sieve file from ~815000 n values to 746961 values. This is a linear reduction of 8.3%. However, I believe the speed of the sieve is related to the sqrt(n)
...


The n in that is the whole range, not the number of candidates remaining. So it won't speed up anything apart from the file load part of the restart times. The only algorithm that depends on the number of candidates is a linear one, and therefore very expensive for these large ranges.

Round up of other questions:
Oh - 010 did in theory have the 'G' abbreviation for 000000000 in the p range. I'm not sure I tested it though!

Regarding speeding up as p increases, that's a feature of there being fewer p to test. The density is ~1/log(p). So a chunk near 10^11 would have 10/11-ths of the number of primes in the same sized chunk near 10^10.

Overflow printouts - I'm pretty sure that no prime will ever overflow the full 6 times, and therefore I have nothing to worry about. Maths proves this to be a false assumption though, and in theory there will be a few annoying primes that slip through.
I might just shut the printouts up until there's been a 5th overflow. Given that people have faster machines than I, I will be slowing down the print statements in the next version.

Unpredictable dependency of speed on the dimension parameter - that's confusing. The working set for the DLOG is actually the same size whatever the dimensions are, as I use a hash-table, and I assume that the whole thing is fighting for the cache, and so cache-abuse should be the same whatever the dimensions are. My next tweak will hopefully decrease cache-abuse by decreasing the working set size. Maybe that will make the timings smoother. Fighting cache can be an unpredictable business, particularly in these scatter-write/gather-read scenarios. I think the P4 has a small L1 data cache, doesn't it? If so, hopefully P4s have more chance of seeing an improvement from this next change.

Phil

Moo_the_cow
01-26-2003, 01:01 AM
Does anyone know the best alpha value to use for SoBSieve
v.1.12? I think it's 1 or 3.2, but it may change for different
versions.

jjjjL
01-26-2003, 02:24 AM
Ahh, I had a feeling the speed dependence might be on the range of n and not the # of candidates. In that case, the sieve file will probably not be updated too often.

-Louie

ceselb
01-26-2003, 05:50 AM
Originally posted by jjjjL
the sieve file will probably not be updated too often.

Well, the sieve should take a little less RAM now and that's a good thing imho. Perhaps another update when 1T is mostly done?

Mystwalker
01-26-2003, 07:58 AM
Originally posted by FatPhil
Overflow printouts - I'm pretty sure that no prime will ever overflow the full 6 times, and therefore I have nothing to worry about. Maths proves this to be a false assumption though, and in theory there will be a few annoying primes that slip through.

Well, I actually had one of those yesterday or the day before...
I guess it was the 009 version with a low alpha value (0.8 - which was the best setting I found). With 010, hash overflows (at the speed-optimal d value) are much reduced.

ltd
01-26-2003, 09:06 AM
:|ot|:

To Paul.

Are some of the performance gains also applicable to other calculations made with NewPGen, and will there be a new version in the months to come?

Keep up the good work.

Lars

RangerX
01-26-2003, 11:04 AM
What do you mean by overflow the 6 times? Right now my client is showing 3653|74|2|1|1|1. Is any of that bad?

ceselb
01-26-2003, 11:29 AM
Below quoted from a mail from Phil (Bold is me).

Is the proportions of "hash overflows: 9388|154|1|1|0|0" in any way
related to dimension (and performance)?

Yes. The more 'big steps' in the DLOG, the more likely my hash table is to
overflow. If the big step stage is faster then I want to make it as large as
possible. However, if I overflow the hash table I need to do that prime again
with fewer big steps. Therefore an overflowed prime costs at least one and a
half times a normal prime, time-wise. So the balance is making the first
attempt small enough such that it's as fast as possible, but large enough
that it doesn't overflow too often. Overflowing 1 in 1000 primes is nothing.
Multiple overflows choose progressively fatter rectangles with fewer big
steps, and incur a further cost.

The balance is such a compromise that the dependency of speed on the
dimension parameter is reduced. If you choose a parameter too small, then you
get faster DLOGs, but you overflow more often, so you have to redo more often.
Likewise the other way. So it's pretty flat performance-wise on the whole.

nuutti
01-26-2003, 11:30 AM
I guess that last 1 means that you have overflowed more than six times.

Yours,

Nuutti

FatPhil
01-26-2003, 11:40 AM
Originally posted by FatPhil
I had an interesting brainwave today

Which didn't work. There's no point in me gnashing if I have no new ideas to chew on, so I've sent the 010 code, explanation and a few musings to Paul. I hope that he can either provide some new insights for new revolutions, or at least implement my algorithm more efficiently in assembly for a simple faster evolution.

Oh, 'G' doesn't work! (due to a lack of a "++e;").

Phil

jjjjL
01-26-2003, 06:22 PM
Originally posted by FatPhil
The n in that is the whole range, not the number of candidates remaining. So it won't speed up anything apart from the file load part of the restart times.

So there's no reason to update the sieve file?


Originally posted by FatPhil
The only algorithm that depends on the number of candidates is a linear one, and therefore very expensive for these large ranges.

So it's beneficial to update the sieve file?

:confused:

-Louie

FatPhil
01-26-2003, 07:50 PM
Originally posted by jjjjL
So there's no reason to update the sieve file?


Indeed.

The algorithm that depends on the number of candidates is the one we don't use! (it's effectively just glorified trial-division.) Pretty useless for a situation like this, forget I mentioned it.
(I'm not telling the whole story, but it's long, complicated, and wouldn't change the way we'd do things, so it's best left forgotten.)

Sieve-file download time, and client startup time are pretty much the only things that are affected. And they're both itsy-bitsy.

If you don't like duplicates (composites removed by two or more primes), then you can reduce the number of them by having people sieve the latest file, but I can see no reason to be afraid of duplicates.

Psychologically, it might be good for the sievers to see the fruits of their (combined) labour by having them work on ever-decreasing sieve-files. I know I'd prefer to see the files shrink by a couple of thousand lines each time I started a new range. However, I've got a fast SDSL, and download times are irrelevant.

Phil

Mystwalker
01-26-2003, 07:54 PM
However, I've got a fast SDSL, and download times are irrelevant.

Those who mind downloading a new sieve file every now and then don't have to... :)

I think it's best to make an update from time to time - though it's not that critical.

Moo_the_cow
01-26-2003, 09:09 PM
2 questions:

1.) Have there been any duplicates?
2.) If so, what is the chance of a factor being a duplicate?

Mystwalker
01-26-2003, 09:41 PM
Assuming equipartition (which I think holds), the chance that a factor is a duplicate should be "<count of submitted factors>" : 746961, as 746961 is the current number of candidates in the sieve file.

dudlio
01-27-2003, 02:00 AM
With 12 k's and 17M n's, that's 200M candidates. How did we get down to 750K so quickly? Sieving?

FatPhil
01-27-2003, 04:41 AM
Originally posted by Moo_the_cow
2 questions:

1.) Have there been any duplicates?
2.) If so, what is the chance of a factor being a duplicate?

1) Absolutely. While I was soak-testing 010 before release, I noticed that it removed one candidate that didn't come out of Paul's SoBSieve. The maths told me I was right to remove the candidate, so why didn't Paul's... because the Sob.dat in the directory where his exe was, was more recent than the one where mine was - that value had already been removed! And that was in the space of only a fraction of a billion.

2) On the way from 800000 to 750000 candidates, 50000 need to be removed. If you can select 50000 random numbers and have none of them identical, then you're cheating. Many numbers will be removed multiple times. I'd be surprised if there weren't over 55000 submissions in order to remove 50000 numbers.

If the central filter file is down to 700000 candidates, and you know that you are working on a file that has 800000 candidates, then your effective sieving rate is 7/8ths of what it looks like, as you can be sure about 1/8 of what you find will already have been found.
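As a rough sanity check on the submissions estimate above: if each submitted factor hits a uniformly random candidate, the expected number of submissions needed to remove 50000 distinct candidates out of 800000 follows the coupon-collector formula. A Python sketch, using only the numbers quoted in this post:

```python
import math

def expected_submissions(n, k):
    """Expected random draws (with replacement) from a pool of n
    to have hit k distinct items: sum n/(n-i) for i in 0..k-1,
    well approximated by n * ln(n/(n-k))."""
    return n * math.log(n / (n - k))

# Removing 50,000 of 800,000 candidates:
print(round(expected_submissions(800_000, 50_000)))  # a little over 51,000
```

So somewhere over 51,000 submissions would be expected for 50,000 removals — the same order as the guess above.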

Phil

FatPhil
01-27-2003, 05:24 AM
Originally posted by dudlio
With 12 k's and 17M n's, that's 200M candidates. How did we get down to 750K so quickly? Sieving?

Yes!

These families are deliberately chosen with properties that mean they sieve down to almost nothing. Basically, vast proportions of the candidates have very small divisors, and simply by sieving with the first dozen or so primes you will throw out 96% of the candidates! (a more normal set of numbers would throw out about 80%, say).

Let's look at k=22699 (the most extreme)

> for(i=1;i<=720;++i){printf("%d %d\n",i,factor(22699*2^i+1));}
1 3
2 7
3 3
4 5
5 3
6 11
7 3
8 5
9 3
10 17
11 3
12 5
13 3
14 7
...

Just by sieving with 3,5,7,11,13,17 and 19 the only candidates that will remain will be those with n={70, 118, 142, 190, 214, 262, 334, 358, 430, 478, 502, 550, 574, 622, 694, 718} mod 720. i.e. 16/720 remain, or 2.22%!
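That claim is easy to check directly — a brute-force Python version of the GP loop above, assuming nothing beyond the numbers already quoted:

```python
# Sieve k = 22699 with the first few odd primes and see which
# exponents n in 1..720 survive.
k = 22699
small_primes = [3, 5, 7, 11, 13, 17, 19]

survivors = [n for n in range(1, 721)
             if all((k * 2**n + 1) % p != 0 for p in small_primes)]

print(len(survivors))  # expect 16, i.e. 16/720 = 2.22%
```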

And this high density of composites is exactly the reason why they have such a low density of primes (to wit, currently none are known).

Phil

RangerX
01-27-2003, 11:01 AM
Sorry for the rather simplistic question, but basically we're sieving to remove easily-found composites so that we don't have to check them for primality (is that a word?) later?

dudlio
01-27-2003, 11:31 AM
basically we're sieving to remove easily-found composites so that we don't have to check them for primality (is that a word?)

Yes, and yes :) Sieving works in the opposite direction from prp (primality) testing. In prp'ing, we compute k*2^n+1 and see if the result is (probably) prime. In sieving, we start with a prime p (a possible factor) and solve for the n values whose candidates it divides.
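One standard way to "solve for n" is a baby-step giant-step discrete log. This is only a sketch of the idea in Python, not the actual code either sieve uses, and the small k, p values are made up for illustration:

```python
from math import isqrt

def bsgs(base, target, p):
    """Solve base^x = target (mod p) for prime p; None if no solution."""
    m = isqrt(p - 1) + 1
    baby = {}
    e = 1
    for j in range(m):                  # baby steps: base^j for j < m
        baby.setdefault(e, j)
        e = e * base % p
    step = pow(base, (p - 1) - m, p)    # base^(-m) mod p, via Fermat
    gamma = target % p
    for i in range(m):                  # giant steps: target * base^(-i*m)
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = gamma * step % p
    return None

# p divides k*2^n + 1  <=>  2^n = -1/k (mod p). Tiny example:
# 41 = 5*2^3 + 1, so "sieving" k=5 with p=41 should recover n=3.
k, p = 5, 41
target = (-pow(k, -1, p)) % p
print(bsgs(2, target, p))  # 3
```

In a real siever, one solved n is then stepped by the multiplicative order of 2 mod p to cover the whole n range at once.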


Just by sieving with 3,5,7,11,13,17 and 19 the only candidates that will remain will be....2.22%!

Awesome. Thanks Phil :cool:

paul.jobling
01-27-2003, 11:47 AM
Hi all,

This is a very quick note, as I have been busy over the last four days as my partner has been in hospital for surgery on her knee. In between visiting the hospital, I did manage to find a few hours to do some improvements to SoBSieve.

First of all, I moved the main body of the code into assembler, which gave a 50% improvement in speed. That still left me slower than Phil, which I couldn't understand as I couldn't see what else could be done. However I was missing a *very simple* trick in the code, which Phil passed on to me (thanks Phil! :thumbs: ). It only took about half an hour to implement a naive version and test it, and the end result of all of this activity is that this version is well over twice as fast as 1.12.

I still have some work to do - the assembler code needs to be profiled to get it as fast as possible, and I should really implement a smarter version of the trick - as I am sure Phil will be as well.

I have given it a cursory test and found no problems, but if somebody can give it a quick parallel run and report back to the group that would be useful (I would, but I have got to get to the hospital). I do not expect any problems, but nothing is certain in software development.

Regards,

Paul.

FatPhil
01-27-2003, 01:01 PM
Originally posted by paul.jobling
well over twice as fast as 1.12.


It sounds like you're still at my 009 tech. Unrolling by 2, which was my 009-010 speedup, gave another 50%. The C I gave you _should_ translate 1<->1 into assembly, once you've mapped variables onto mobile stack entries (bletch, which is why I use C rather than asm).

Phil

MikeH
01-27-2003, 01:54 PM
AMD XP 2100+, Win2K:

NbeGon_010 (alpha = 4.5) ...... 112Kp/s

SoBSieve 1.20 (alpha = 1.5) ..... 102Kp/s

(both using p=670G)

paul.jobling
01-27-2003, 03:44 PM
It sounds like you're still at my 009 tech. Unrolling by 2, which was my 009-010 speedup, gave another 50%. The C I gave you _should_ translate 1<->1 into assembly, once you've mapped variables onto mobile stack entries (bletch, which is why I use C rather than asm).

Ah, well my C code and your C code were doing the same thing, but I understand mine! I'll just stick with the conversion of my code into assembler, as I know what it is up to, and I have already done it. But the two programs were probably doing exactly the same thing in slightly different ways - your hash is completely different to mine, for example. Now that I have got that last piece of the puzzle I can start profiling the code....

Cheers,

Paul.

Moo_the_cow
01-27-2003, 03:58 PM
Paul, thanks for the new version :D

However, with that version, my rate skyrockets to 40,000 p/sec
when I start the client, but falls to 35,000 p/sec a few minutes
later. Should I just keep restarting the client every half an hour
or so to improve production, or is this just a bug in the client?
Have any other SoBSieve v.1.2 users had this problem?

P.S.

"Sieving a G a day keeps many PRP tests away"

jjjjL
01-27-2003, 08:15 PM
duplicates are pretty common. of the numbers left in the database,

714883 have no factors (yet ;))
181629 have exactly 1 factor
5987 have 2 known factors
232 of them have 3 known factors
6 of them actually have 4 known factors

this excludes all the numbers below 1G which were never placed on the server. in other words, the sum of all the above should be about equal to the number of candidates left after sieving from 0 - 1 G.

4847*2^19202151+1 is one of the highly composite numbers...

120435469129 | 4847*2^19202151+1
15608701133 | 4847*2^19202151+1
34133798287 | 4847*2^19202151+1
224192486249 | 4847*2^19202151+1




on another totally separate note, i was thinking about p-1 factoring...

phil, paul, do either of you have an intelligent opinion about the viability of doing this? as prp test times increase, the benefit of doing it seems to increase too. i was toying with making it into a java applet the other day. i knew it would run slow-ish but i was thinking about how it would be cool if there were a webpage you could just go to that would start doing work for the project on *ANY* computer you were on w/o having to download and install software or even be on a system capable of running SB/SoBSieve. my java app idea is a little ambitious and mostly academic, but it would be interesting. anyone interested?

-Louie

paul.jobling
01-28-2003, 06:32 AM
However, with that version, my rate skyrockets to 40,000 p/sec when I start the client, but falls to 35,000 p/sec a few minutes later

This is probably just an artifact of the rate sampling. I see something similar; the first couple of rate reports are always higher than the steady state.


AMD XP 2100+, Win2K:

NbeGon_010 (alpha = 4.5) ...... 112Kp/s

SoBSieve 1.20 (alpha = 1.5) ..... 102Kp/s

Ah, that is the same system as I have here, so I can confidently report that the attached SoBSieve 1.21 will run at 129Kp/s at 670G with alpha=2.5 (note the new value!)

Phil - this speed up came from doing the last improvement properly - ie for each k, we find a suitable multiplier, then rather than multiplying by 16 (as at the moment), it multiplies by 2^48, 2^24, etc, as appropriate.


on another totally seperate note, i was thinking about was p-1 factoring...

Well, we are currently pulling out 12 digit factors very quickly through trial division - and we are looking at a whole load of numbers all together. For p-1, though, you would be looking at one number (eg 4847*2^14342252+1) at a time. So this should be compared to the PRP testing time; but remember that all it can do is eliminate a number by finding a small factor - if it doesn't find one, the number has got to be PRPed anyway.

Hmmm - I think investigation is required - I wonder if it would be worth putting x% of the PRP time aside to do this? It is a win if >x% of numbers are proven composite through the p-1 test...
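For reference, the P-1 idea being weighed here looks like this as a minimal stage-1 Python sketch; the bound B1 and the toy number are illustrative, not project values:

```python
from math import gcd

def pollard_pm1_stage1(n, B1):
    """Pollard P-1, stage 1: can find a prime factor q of n
    when q-1 is B1-smooth (all its prime-power parts are <= B1)."""
    a = 2
    for q in range(2, B1 + 1):
        # crude trial-division primality check; fine for tiny B1
        if all(q % d for d in range(2, int(q**0.5) + 1)):
            qe = q
            while qe * q <= B1:   # largest power of q not exceeding B1
                qe *= q
            a = pow(a, qe, n)
    g = gcd(a - 1, n)
    return g if 1 < g < n else None

# 1403 = 23 * 61, and 61 - 1 = 60 = 2^2 * 3 * 5 is 10-smooth,
# so B1 = 10 is enough to pull out the factor 61.
print(pollard_pm1_stage1(1403, 10))  # 61
```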

Regards,

Paul.

olaright
01-28-2003, 07:05 AM
:|party|: :|party|: :|party|:

On my computer Sobsieve again runs the fastest :->

You say you get 129K; don't let this big number distract you!! Think of it as small; I am sure if you take 200K as "the goal", you can do it :notworthy .


Sieving is so important!! I have to run my computer for ten days,
just to test one number. So as long as you sieve more than 1 number/day it is worth it!

Keep up the good work :cheers:

paul.jobling
01-28-2003, 10:45 AM
Originally posted by ltd
:|ot|:

To Paul.

Are some of the performance gains also applicable for other calculations made with newpgen and will there be a new version in the months to come?

Keep up with your work.

Lars

Hi Lars,

Yes is the quick answer - as far as the fixed-k sieves go anyway. There is a new version in beta test right now that doesn't include all of the changes that I have made in the last week or so, but is still significantly faster than the previous release.


You say you get 129K; don't let this big number distract you!! Think of it as small; I am sure if you take 200K as "the goal", you can do it.

I have been optimising the assembler this morning, in between tending to Karen's every (k)need, and now SoBSieve manages 148K p/s. To go significantly faster would almost certainly need a new algorithmic or mathematical insight.

I will probably be posting the next version at 8pm GMT (is the time in GMT for everybody?). Hopefully I can get a little more out of it, but don't hold your breath.

Regards,

Paul.

paul.jobling
01-28-2003, 01:57 PM
Well, this is version 1.22. It is 8% faster than 1.21 on this PC, and unless I have a really good idea or I find a major optimisation, this will be the last release for a while.

Regards,

Paul.

olaright
01-28-2003, 02:58 PM
When i start the client this message appears:


WARNING: 64 bit integers are badly aligned as qwords!

but it seems to keep calculating, so i guess nothing to worry about.

I have PIII 450 192mb ram; win 98; alpha 2.5
tested >=200G

btw I think I get >8% faster!! great job :)


cheers

Ola

rogue
01-28-2003, 03:23 PM
I get the same as olaright, but it stops processing.

The application quits saying that "An attempt was made to execute an illegal instruction (0xc000001d) occurred in the application at location 0x00401618"

add ecx,dword ptr [ebp-0Ch]
adc edi,dword ptr [ebp-8]
cmovb eax,ecx <-- failure point
cmovb edx,edi

I have a AMD K6-2 at 450 MHz with 256 MB ram.

--Mark

ltd
01-28-2003, 03:43 PM
I have no problems with the new version.

Tested on Intel P III 700 with win 2K
and AMD XP2000+ on Win XP pro.

Lars

Mystwalker
01-28-2003, 04:36 PM
Works good on my systems, too (Win2k).
Got a nice speed increase on my Duron system: ~45kp/sec (NbeGon 010) -> ~80kp/sec (SoBSieve 1.22) :)
Speed of my Intel system stayed more or less the same, though...

jjjjL
01-28-2003, 07:38 PM
Yea! SoBSieve is much faster now... even faster than (the albeit more portable) NBeGon:

Athlon 1.33GHz WinXP
----------------------------------------------------
88k p/sec with NBeGon v0.10 (d=3.5)
117k p/sec with SoBSieve v1.22 (a=2.75)

33% speedup is always nice :)

I'd recommend people sieving in windows use SoBSieve again. Ideally, windows machines should go back to SB when sieving loses most of its steam.... however, with all the speedups, it's hard to say where that will be exactly.

I remember initially setting a goal of 1T before n > 3000000 start getting assigned. But hey, that was in the ancient history of... 8 days ago ;) (http://www.free-dc.org/forum/showthread.php?s=&threadid=2370&perpage=25&pagenumber=2) ... back when NBeGon didn't exist, SoBSieve was v1.06, and the best rates being reported were around 9000 p/sec. Fast forward a week to the present, and we're up to around 100k p/sec on the same kinds of computers! That changes the optimal sieve limits quite a bit. Unfortunately, SB hasn't managed to go 11x faster in the last week like the sieve. Long story short: sieving higher than we planned makes sense now.

I think the new goal should be 5T. 10T would be closer to optimal but may be unrealistic to do in the next month. We'll see. If I put The Man (http://cac.engin.umich.edu/) to work on it, 10T may be realistic after all. What do you guys think?

-Louie

ceselb
01-28-2003, 07:49 PM
Phil's page (http://fatphil.org/maths/sierpinski/) has some info on optimal sieve depth. I'd calculate it myself, but I don't know how long a PRP test takes.

Moo_the_cow
01-28-2003, 09:48 PM
Yeah!!!!!!!!

SoBSieve v.1.22 alpha 2.75 ----- 60,000p/s :D
(PIII 800 MHZ)

ahhhh.... what a contrast to the prehistoric days
when everyone was sieving 10X slower.

dudlio
01-29-2003, 01:33 AM
I've been trying to estimate the optimal sieve depth for about a week now. I have a hunch it's in the hundreds of T.

I generally assume that a PRP test takes 1 day, that the project will go to 20M, and that an average machine is 1.5 GHz, aka Athlon 1.2, what I have :)

I can sieve about 6G per day. As you can see from Nuri's graph (http://www.free-dc.org/forum/showthread.php?s=&threadid=2442), the rate of removal halves with each doubling of p. If you extend that graph, you get 0.12 factors per G (one per day) somewhere around 200T.

Here's another view, based on Phil's statement (http://fatphil.org/maths/sierpinski/) that the number of candidates remaining is inversely proportional to the natural log of p. As of 25G we had about 750,000 candidates. At 200T we will be down to around 550,000.

Sieving to 200T is a month's work, systemwide, and will remove 6-7 months' PRP work. Sieving to 400T is another month's work, and will remove an additional 12k candidates, or only a half-month's work.

So there you have it :) Of course, PRP'ing will slow down, so we might go to 500T or a quadrillion or two, but at that point you're talking about very long timeframes and the ability to predict breaks down.
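The model behind those figures can be sketched as follows; the constant is fitted from the single 25G data point quoted above, so treat the outputs as rough estimates only:

```python
import math

# Mertens-style model: candidates remaining after sieving to depth p
# scale like C / ln(p). Fit C from ~750,000 candidates left at p = 25G.
C = 750_000 * math.log(25e9)

def remaining(p):
    return C / math.log(p)

for p in (25e9, 1e12, 5e12, 200e12):
    print(f"p = {p:.0e}: ~{remaining(p):,.0f} candidates")
# At 200T the model predicts roughly 545,000 left, close to the
# ~550,000 figure above.
```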

jjjjL
01-29-2003, 02:40 AM
yeah, you're right on the high sieve limit being optimal thing. there are several things your model most likely doesn't take into account that may cause you to overestimate the amount of sieving that should be done... such as repeated factoring of the same number, finding a prime for a given k, etc but most of that can't be accurately predicted anyway. the rule of thumb for those of you who aren't as into the math should probably be:

if you want to sieve it, you probably should ;)

in other words, I don't think that it is likely that we will ever reach the optimal sieve level no matter how many people sieve.

however, it is likely that we will get arbitrarily close to the optimal level. in other words, the levels of sub-optimal-ness of our sieve limit will not be too high so long as at least a decent threshold like p = 1 - 5T is achieved. if you were to picture it as a graph, it would look something like:

http://www-personal.engin.umich.edu/~lhelm/approx.jpg


for those of you who are into the more intricate, mathy details, some of my comments on the matter can be found here:

http://www.free-dc.org/forum/showthread.php?s=&threadid=2424


-Louie

paul.jobling
01-29-2003, 04:44 AM
Hi guys,

First of all, the software will not work with the K6 - sorry :( That is because it requires the CMOV instructions, which are available on practically every other processor in use today. I should put a check in for that.

As for the message WARNING: 64 bit integers are badly aligned as qwords!, this is only a warning and while the software will run fine, it will just not be as fast as possible. This is something that I intend to fix today, if I can - it is probably just a few instructions to the compiler.

Regards,

Paul.

FatPhil
01-29-2003, 11:56 AM
Originally posted by jjjjL
Yea! SoBSieve is much faster now... even faster than (the albeit more portable) NBeGon:


Yeah, NbeGon stopped development (or I stopped development on it), 30 seconds before I handed my source code and description to Paul. I realised that the only thing that I could do further was optimise, and as Paul said from the outset this shouldn't be an optimisation competition. I'd spent the previous week working out _algorithmics_, proof of principle, not optimising. As a proof of principle I don't think it was too bad.

Paul's software has since then had two major algorithmic boosts, one of which I had already put in NbeGon; the other wasn't in it, it was merely <taps head> up here. So in theory I _could_ put together one final NbeGon, which would be at the same "tech" as Paul's, but not as highly optimised, as mine would be all C.

To be honest, I've not looked at the problem, save exchanging mails with Paul, for several days, I've had several other things on my plate. (Anyone want a wacky replacement to Pollard's P+1 factoring algorithm?).

Anyway, back on topic, I noticed mention of using candidate-by-candidate factoring algorithms in the thread, and it's worth trying to analyse to see what they say:

If we pluck a figure out of the air for sieving depth, say 10T, and pretend that the average machine does <100000p/s, then we conclude that the job's >1000 CPU days. What else could we do in that number of CPU days? If we were to sieve half way, then we'd have 500 days spare for factoring. Say we have 700000 candidates left (I'm in 1s.f. land here), then that's about 1 minute per candidate. You can't do anything useful in one minute. Not even 10 minutes would be useful. Once you know no factors are below 5e12, and you're time limited, you're probably looking at just running Pollard's P-1, with B1>10000, B2=B1*30, and on numbers of this size that's just prohibitive.
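The arithmetic in that paragraph, spelled out in Python with the same round numbers:

```python
# Back-of-envelope arithmetic from the paragraph above.
depth = 10e12            # sieve everything up to p = 10T
rate = 100_000           # ~100000 p/s per machine
cpu_days = depth / rate / 86_400
print(f"{cpu_days:.0f} CPU days")             # 1157 -- i.e. ">1000 CPU days"

# Sieving only half as deep frees roughly half that time for factoring:
candidates = 700_000
seconds_each = (cpu_days / 2) * 86_400 / candidates
print(f"{seconds_each:.0f} s per candidate")  # ~71 s -- "about 1 minute"
```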

i.e. sieving 700000 candidates is _much_ more cost-effective than factoring.

The reason GIMPS use P-1 and ECM to factor candidates is because they _can't_ sieve. Even the old style trial-division NewPGen was probably more effective than factoring, and we're 100 times faster than that now.

Phil

MikeH
01-29-2003, 03:15 PM
Phil said
Paul's software has since then had two major algorithmic boosts, one of which I had already put in NbeGon; the other wasn't in it, it was merely <taps head> up here. So in theory I _could_ put together one final NbeGon, which would be at the same "tech" as Paul's, but not as highly optimised, as mine would be all C.
Phil, I would be very very grateful if you could put together that one final version. I've found that your Windows version is very easy to put together in an automated p range distributor, and factor collection.

10% faster now would mean 10% faster from here on in. :help:

Paul's client is great on a 'regular' desktop PC, but not on unattended PCs.

Please, please, please. And how about the Linux guys - do it if only for them. Please, please please. :notworthy

dudlio
01-30-2003, 05:24 PM
Is there a reason why nbegon 1.0 for windows would fail to write a batch file upon exiting? I'm playing around with it and the sob.bat file isn't appearing like it used to.

garo
01-30-2003, 05:35 PM
Hi Louie,
Would it be possible to package a new Sob.dat - esp since ceselb has returned 25-50G - with 1.22 and put that on the sieve page? I'm sure the number of tests has gone below 746k now.
Thanks,

FatPhil
01-30-2003, 06:26 PM
Originally posted by dudlio
Is there a reason why nbegon 1.0 for windows would fail to write a batch file upon exiting? I'm playing around with it and the sob.bat file isn't appearing like it used to.

Odd. I decreased the progress reporting frequency by a factor of 8, and in theory didn't touch the batch file frequency (10 mins IIRC). It doesn't write a file when exiting, only when a 10 minute period expires, so if you do a really short run you won't get a batch file.

I'll be looking at NbeGon again this weekend, as I've just completed my new factoring algorithm. I shall bear this issue in mind as I reacquaint myself with my code.

Phil

garo
01-30-2003, 08:41 PM
Phil,
Could you change it to write a batch file when it exits?
thanks

PS: Any chances of a ver 010 or 011 for Sun?

dudlio
01-30-2003, 09:14 PM
OK, it works. I wasn't letting it run long enough. Next question: can Tk spawn a separate thread?

Heh. I can make a GUI that runs nbegon but the gui hangs lol.

jjjjL
01-30-2003, 09:27 PM
A new sieve file is bundled with v1.22 of SoBSieve on the sieve page

http://www.seventeenorbust.com/sieve/

Reduced from 746961 -- > 674551 candidates left.

All currently submitted factored numbers have been removed, but the lower bound is technically only p < 100 billion.

Just so you know, this file shouldn't increase performance or reduce memory usage. However, it will make it so that fewer numbers will have multiple factors submitted, so I recommend you upgrade anyway. It is possible that there may be a slight speedup since fewer factors will need to be verified. Enjoy :)

-Louie

Slatz
01-30-2003, 10:27 PM
only been sieving for a day now...running v1.22 on a dell machine with a P4 1.8 ghz and 512 ram.

Only getting about 55000 p/sec with an alpha of 2.9

that seems awful slow to me for this computer....

Slatz

ceselb
01-30-2003, 10:55 PM
The PIV isn't really that fast at this. :(
My 1.5 gets 41kp/sec, so your figure is quite normal.

kmd
01-30-2003, 10:56 PM
It seems like P4's aren't doing as well on sieving as compared to other processors. My guess is that this type of work isn't as heavily SSE2 optimized as the normal SoB client, so your numbers may seem bad compared to others. I was only getting around 55000 p/sec with a 2GHz P4 too, so don't think it's just you.

Slatz
01-30-2003, 11:04 PM
thanks for the replies... guess it will just crunch along at whatever speed it will go :p

Slatz

FatPhil
01-31-2003, 04:05 AM
Originally posted by garo
Phil,
Could you change it to write a batch file when it exits?
thanks

PS: Any chances of a ver 010 or 011 for Sun?

The program doesn't exit cleanly unless it's finished. The only other way to stop it is to kill it. So detecting that requires signal handling, which I can do under most Unices, but I don't know how well it translates into 'doze.

I stopped NbeGon at 009 on Sun because the maths in the floating point unit was slower than in the integer unit, unlike on all other processors. 010 relied on the FPU more, and 011 relies entirely on the FPU. I don't reckon that the small speedup I'll get in 010->011 will be enough to counter the slow-down. I'll try it, obviously, but if it's slower than 009, then I won't be surprised.

Phil

peterzal
01-31-2003, 12:42 PM
what would be the optimal alpha value in sobsieve 1.22 for an amd 2000+ running windows xp?

Mystwalker
01-31-2003, 12:58 PM
Unfortunately, a decent benchmarking option is missing. :(
The only chance I see is to set a value and look at the rate. But this is very inaccurate.
With NbeGon, there's a better way of finding the optimal alpha value, though...


I guess your alpha value should be a bit higher than the default 2.5 - it's 3.4 @ my Duron. Don't know if the architectural differences of the AthlonXP change this a lot...

garo
01-31-2003, 01:21 PM
Originally posted by FatPhil
The program doesn't exit cleanly unless it's finished. The only other way to stop it is to kill it. So detecting that requires signal handling, whichI can do under most Unices, but would know how well it translates into 'doze.
Phil

Okay! I don't know if anyone else is using Sun so it's probably not worth your time to put in a signal handler. Let's see if other people - esp Linux folks want it.

Thanks for trying and seeing if the FPU on Sun can go faster. BTW, what machine do you have / are compiling on? I am running this on a dog slow Sparcstation5 110 MHz and still getting a decent 1700p/sec. If you do not mind sending me the src code I could see if I can try and optimize this on my machine. But since there really aren't going to be very many people using that machine I'll understand if you don't want to.

MikeH
01-31-2003, 02:07 PM
As a fairly serious user of NbeGon on 'doze right now, for me a signal handler would be no gain.

The client gets killed, I lose (at most) 10 minutes work. No problem.

FatPhil
01-31-2003, 04:26 PM
Originally posted by garo
Okay! I don't know if anyone else is using Sun so it's probably not worth your time to put in a signal handler. Let's see if other people - esp Linux folks want it.

Thanks for trying and seeing if the FPU on Sun can go faster. BTW, what machine do you have / are compiling on? I am running this on a dog slow Sparcstation5 110 MHz and still getting a decent 1700p/sec. If you do not mind sending me the src code I could see if I can try and optimize this on my machine. But since there really aren't going to be very many people using that machine I'll understand if you don't want to.

When I do an 011 build (this weekend) I'll gather together a source archive. I can't distribute the full source, but I can point you to the missing files - a marginally slower version of the prime generator (that mine's based on, and hence mine is not 'mine').

There's a far easier method of losing less than 10 minutes, and that's to make the saves more frequent. Is 1 minute fast enough for ya?

Phil

garo
01-31-2003, 07:32 PM
Oh yeah!! I think 1 minute is cool! Don't worry about anything else. Just make it 1 min :)

RangerX
02-01-2003, 10:30 AM
Originally posted by MikeH
As a fairly serious user of NbeGon on 'dose right now, for me a signal handler would be no gain.

The client gets killed, I lose (at most) 10 minutes work. No problem.


Actually, I found that if you take a screenshot right before you kill the program (alt-printscrn when you have focus on nbegon), you can use the last number it outputted and start it from there. Then you lose almost nothing besides the time it took you to end and restart the program.

Mystwalker
02-01-2003, 10:53 AM
You can just mark the p number and press the right mouse button (under W2k; with other Windows versions, it's a little different). That way, you copy the value and can paste it into the batch file.

rogue
02-02-2003, 01:52 PM
A question for Phil. Which version of GCC did you use for the Mac OS X version of your sieve? What level of optimization was used?

--Mark

jjjjL
02-02-2003, 04:05 PM
random stats:

667094 candidates 3000000 < n < 20000000

243210 factors submitted total

143218 factors submitted by me [i did do the lowest range plus i have access to super computers]

17647 unfinished tests n < 3000000

70 factors submitted for n < 3000000 [by me and one other person]

all in all, things are looking good. :cheers:


-Louie

FatPhil
02-02-2003, 06:48 PM
Originally posted by rogue
A question for Phil. Which version of GCC did you use for the Mac OS X version of your sieve? What level of optimization was used?

--Mark

"""
$ gcc --version
[xxxxxxxx:~/Temporary Items/Prime] music% gcc --version
gcc (GCC) 3.1 20020420 (prerelease)

$ uname -a
Darwin xxxxxx.xxxxx.xxx.xxx 6.3 Darwin Kernel Version 6.3: Sat Dec 14 03:11:25 PST 2002; root:xnu/xnu-344.23.obj~4/RELEASE_PPC Power Macintosh powerpc
"""

Compilation options -O3

The problem with the Mac is that it's only a 32-bit processor, and, like the P4, is relatively underpowered when using just the ordinary FPU. You've got to use the SIMD instructions to get it to fly.

A 32-bit processor is seriously hindered for this kind of work, as a 64-bit product requires 4 times the work.
Our baby-steps is unusual in that we don't double or halve, we do full modular multiplies, and with the x86 it's possible to do some of those entirely in the FPU. With the less capable FPUs (all other chips), there isn't that option. So non-x86 are partly at a disadvantage, but PPCs are doubly disadvantaged.
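The "4 times the work" comes from schoolbook multiplication with 32-bit limbs — a 64x64-bit product needs four 32x32 partial products where a 64-bit CPU needs one multiply instruction. A sketch of the identity (Python just to show the limb arithmetic; a real 32-bit CPU also has to propagate carries):

```python
MASK32 = (1 << 32) - 1

def mul64_with_32bit_limbs(a, b):
    """Build a 64x64 -> 128-bit product from four 32x32 -> 64-bit
    multiplies, as a 32-bit processor must."""
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32
    # the four partial products
    p0 = a_lo * b_lo
    p1 = a_lo * b_hi
    p2 = a_hi * b_lo
    p3 = a_hi * b_hi
    return p0 + ((p1 + p2) << 32) + (p3 << 64)

a, b = 0x1234_5678_9ABC_DEF0, 0x0FED_CBA9_8765_4321
assert mul64_with_32bit_limbs(a, b) == a * b
print("ok")
```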

Sorry, that's just the way that things are.
My bet is that it really flies on a proper Power4 chip, however...

Phil

rogue
02-02-2003, 08:15 PM
And it's also unfortunate that AltiVec can only handle 32-bit floats and not 64-bit ones. I was curious as to the version of GCC as I know (from personal experience) that GCC 3.1 optimizes much better than GCC 2.95 in OS X. I've seen 10 to 15 percent gains on stuff I've written.

I don't have access to a Power4, but if the rumors are true about Apple using IBM's 970 chip (a stripped down Power4), then one would get both 64-bit support and 2 FPUs to boot.

In case you or anyone else is curious, I get about 6400 p/sec with d=2.0 on a G4-500. It's a little higher with d<2.0, but there are many more rehashes. At the current rate, I will finish the range in about 9 days.

--Mark

paul.jobling
02-03-2003, 09:54 AM
Well, here is a 1.24 release. This contains the following:

- minor GUI changes to ensure that it all looks nice;

- the selection of displaying rate or status is toggled by clicking in the box, rather than through a menu item;

- when the range is finished, the data is automatically placed in the clipboard ready to be pasted into the web page;

- the priority and info display selection are saved and restored when you run again;

- there may be a slight performance gain on some processors - I see a boost for the Athlon; no gain for PIII.

Regards,

Paul.

Nuri
02-03-2003, 03:01 PM
Originally posted by paul.jobling
- the selection of displaying rate or status is toggled by clicking in the box, rather than through a menu item;
Cool

Originally posted by paul.jobling
- there may be a slight performance gain on some processors - I see a boost for the Athlon; no gain for PIII.

8% increase for my P4-1700. Even cooler.

Nice job Paul, thanx. :thumbs:

Mystwalker
02-03-2003, 04:59 PM
I see a boost for the Athlon; no gain for PIII.

Well, ~5% speed increase on my Duron, but 11% on my P3! :|party|:

Good job, Paul! :cheers:

Troodon
02-03-2003, 06:00 PM
Here are some "alpha benchmark" results of my Celeron Tualatin @ 10x112 MHz (SoBSieve 1.24):

Alpha | ~p/s
------------------
1.000 | 92000
1.500 | 96500
2.000 | 98000
2.250 | 98000
2.500 | 98000
2.750 | 97500
3.000 | 97000
3.250 | 96500
3.500 | 95500
5.000 | 90500

peterzal
02-04-2003, 12:39 AM
yay i got about 10% increase on my p3 600. great job :)

rogue
02-04-2003, 04:03 PM
I have a request for one minor change for NbeGon. Can you put in an option to state how frequently you want to see output? Currently it is 10 seconds, but it would be nice to change that to minutes or even hours. Thanks.

--Mark

FatPhil
02-04-2003, 04:33 PM
Originally posted by rogue
I have a request for one minor change for NbeGon. Can you put in an option to state how frequently you want to see output? Currently it is 10 seconds, but it would be nice to change that to minutes or even hours. Thanks.
--Mark

I was going to implement it that way. I've not had a chance to work on it yet, as when I finished my other project I fell ill with a flu which has prevented me from thinking straight. I'm psyching myself up for tackling it tomorrow.

Phil

paul.jobling
02-06-2003, 07:37 AM
For those of you who are real speed-freaks, I attach a console version of the SoBSieve software. Since there is no Windoze overhead, this is about 10% faster on this NT 4 system (!).

If you run it with no SoBStatus.dat file, it will prompt for the range of p. If there is a SoBStatus.dat file present it will continue from where it left off.

If you want to use it in an unattended manner, it will go through the range then exit. To set up a range, create a SoBStatus.dat file containing two lines,

pmin=<start>
pmax=<end>

for example

pmin=100000000000
pmax=125000000000

Note that it runs at idle priority.

Regards,

Paul.

peterzal
02-06-2003, 08:27 AM
can you make it so that we can change the update interval (how long it takes before it prints out the status to the screen)? this one is way too short heh.

paul.jobling
02-06-2003, 09:25 AM
Originally posted by peterzal
can you make it so that we can change the update interval (how long it takes before it prints out the status to the screen)? this one is way too short heh.

And the problem is....????

garo
02-06-2003, 01:06 PM
Aaargh! The console version does not write the pmin value to SoBStatus.dat. If I stop the program using Ctrl-C and restart, it does not remember the work done.

Also, there doesn't seem to be a way to adjust alpha.

paul.jobling
02-06-2003, 01:11 PM
Ah, it writes a new pmin every hour, so you shouldn't lose more than an hour's work.

As for the alpha bit - you are right. It uses 2.5 by default; I probably ought to allow that to be changed.

Paul.

garo
02-06-2003, 02:22 PM
Every hour?? How about making it 10 minutes or something? Or at least writing it when the command window is closed or Ctrl-C is used.

jjjjL
02-06-2003, 06:08 PM
Paul - great job!

I installed the console sieve on a few computers and it runs great.

Now that you have the console working, is there any hope of building SoBSieve for Linux again? Right now I'm running NBeGon10 on around 20 dual-proc boxes at the U of M clusters at any given time. It's cool seeing them burn through a 600G range of divisors every 3 days. :) Of course, it'd still be cooler to see them crunch through 1T every 3 days :D

Keep up the good work. :)

-Louie

Moo_the_cow
02-07-2003, 06:01 PM
quote:

___________________________________________

Every hour?? How about making it 10 minutes or something?
____________________________________________

Well, Paul will probably fix that (unless he's feeling
particularly lazy ;) )
In the meantime, you can type the last pmin value that
the console version shows into the SoBStatus file.

Mystwalker
02-07-2003, 07:29 PM
But be careful:
When you've started the program via the Explorer, the console will close once you press CTRL-c. In this case, you have to copy the value first and then shut down the program...

dudlio
02-09-2003, 12:57 AM
Congratulations to both Paul and Phil. We now have a pair of very speedy sieving engines that will allow us to hit the 10T mark in a couple weeks with just a small group of participants.

Now that things are settling down again, I'm wondering if anybody is thinking about further automating the sieving process. I enjoy the manual work involved in sieving, but at some point it's going to become hard to coordinate.

For example, I'm splitting work between two machines, and that alone is a source of potential error. Also consider, if we wish to sieve to 200T in 50G increments, the Sieve Coordination List will grow to 4000 entries.

Mystwalker
02-09-2003, 07:07 AM
I'm afraid my proposal in the "Sieve Coordination Thread" thread was deleted prior to Louie reading it. Ok, was a bad idea to post it there... :(

Anyways, I suggested using a DB table for the ranges and a scripted webpage where one can reserve ranges. It should be implementable with only a little effort - less effort than keeping the coordination thread up-to-date. Plus, no one can accidentally reserve ranges already given to someone else.
Another recent idea (due to dudlio's posting) was to let that page generate the proper SoBStatus.dat file - which could be customized to split the range over multiple machines...

jjjjL
02-11-2003, 02:05 PM
we are down to 644675 numbers left.

-Louie

EDIT: You can now view the # of tests remaining for n < 3 mill and n > 3 mill on the stats page (http://www.seventeenorbust.com/stats/). Now it is easy for everyone to see how much sieving helps. :)

Mystwalker
02-13-2003, 02:11 PM
I'm not sure if there already was a program that gets rid of the pmin & pmax lines of the SoBStatus.dat file (as it's needed when using sobsieveconsole), so here's my version. As it's written in Java, everyone should be able to use it...
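The cleaning logic itself is tiny. Here is a sketch of it in Python for illustration (the attached utility is written in Java, and the sample factor line below is made up purely to show the file format):

```python
def strip_status_lines(lines):
    """Drop the pmin=/pmax= bookkeeping lines, keeping only factor lines."""
    return [ln for ln in lines if not ln.startswith(("pmin=", "pmax="))]

# Made-up example of a SoBStatus.dat with two bookkeeping lines and one
# factor line (the factor shown is invented, just to illustrate the format).
sample = ["pmin=100000000000",
          "100000000003 | 21181*2^104+1",
          "pmax=125000000000"]
cleaned = strip_status_lines(sample)
```

The result keeps only the factor lines, ready to paste into the submission page.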

MikeH
02-13-2003, 02:46 PM
Louie said
EDIT: You can now view the # of tests remaining for n < 3 mill and n > 3 mill on the stats page. Now it is easy for everyone to see how much sieving helps.
This is great.:thumbs: Any chance you could add a detail page that breaks down the >3000000 range e.g.

3M - 4M x outstanding
4M - 5M y outstanding
..

Another really useful addition would be a 'lowest sieve point' indicator. In ~12 days' time, n > 3M will begin to be distributed. Although the SoB.dat file doesn't need to be updated right away, it would be rather nice to have somewhere to get a new nmin value; then we sievers could just update every now and again, knowing that the value is correct right now.:)

nuutti
02-13-2003, 03:27 PM
Changing the sieve range does not speed up sieving a lot, but sieving
from 3,000,000 will help us in double checking.
When we find a factor, we will know for sure that the number wasn't prime. Because primes are so rare in these numbers, we can't afford to miss any.

Yours,

Nuutti

P.S. Right now I hope that we are just unlucky, and that there is no bug in the client nor any false report of a prime being not prime (because we have not found any primes in such a long time).

MikeH
02-13-2003, 04:13 PM
Nuutti, you're right on leaving the nmin @ 3M. I've just remembered a discussion before the latest sieving effort started, where it was stated that sieving 3M-20M wasn't significantly more effort than 3M-5M (which was the original suggestion). I guess the converse is true.

On the double check subject, maybe at some point in the future when the current 3M -20M sieving has exhausted the easy factors, it may be worth sieving 0 - 3M deeper as an aid to double checking. We already have the tools, we just need an appropriate sob.dat file, and a new co-ordination thread.

paul.jobling
02-14-2003, 04:38 AM
On the double check subject, maybe at some point in the future when the current 3M -20M sieving has exhausted the easy factors, it may be worth sieving 0 - 3M deeper as an aid to double checking. We already have the tools, we just need an appropriate sob.dat file, and a new co-ordination thread.

It is *very* easy to produce an appropriate SoB.dat file using SoBSieve - producing the initial file is part of its functionality.

If anybody is interested in doing this, let me know - or just start the new thread, give me a prod, and I will post the method in there.

As for the lack of any new results for a while: finding three primes in such a short amount of time was very very lucky IMHO - perhaps it might have been better for the project to find them at longer intervals. In the 3 to 20 million range I expect around 5 or 6 more primes to be discovered. The important thing is that when one is found, it will be one of the largest known primes - perhaps *the* largest.

Regards,

Paul.

nuutti
02-14-2003, 02:47 PM
I have made that sieving file for 1 -> 3,000,000, and it is sieved up to 5G.

The file is here :
http://powersum.dnsq.org/doublesieve/

Nuutti

MikeH
02-14-2003, 04:24 PM
Thanks Nuutti. I'll give 5G - 6G a quick shake down just to see how quickly those factors are eliminated.

nuutti
02-14-2003, 05:15 PM
I used good old 1.06 because it can create the sob.dat file.
Sieving was slow. I don't know if newer clients are faster with low n values?

I tested with SoBSieve 1.24 with p about 2,000 G and n 3,000,000-> 20,000,000 and got rate 28,000 p/s
and then
SoBSieve 1.24 with p about 2,000 G and n 1-> 3,000,000 and got rate 13,600 p/s

Maybe Phil's software is faster when n is low?

I have PII 450.

I think that sieving in the range 3,000,000 -> 20,000,000 should have priority right now, but when we have reached 10T we should consider double check sieving.

Ranges under 3,000,000 are partly sieved very high, but I guess that sub-ranges with n<1,000,000 are not sieved very high. Because the speed of sieving does not depend very much on the n range, we can use the 1<n<3,000,000 range all the time.

Nuutti

nuutti
02-14-2003, 05:28 PM
I made a small test run SoBSieve vs. NbeGon_010
and here are my results :
Test SoB.dat was a file for 1->3,000,000.
Range was 20G -> 20G+1,000,000.
Computer was PII 450.
Time to complete the range :
SoBSieve 80s.
NbeGon_010 34s.

It seems that Phil's NbeGon_010 is about 2.3 times faster than
Paul's SoBSieve when n values are small.


Nuutti

MikeH
02-14-2003, 05:51 PM
I see roughly the same.

Paul's is about half the speed for 1<n<3M as I'd expect to see for 3M<n<20M. Phil's is about 2.5x faster than Paul's in this low n range.

Now I've started another test for p = 6G - 7G, I'll let it run, submit the factors, and see how it looks.

In the p = 5G - 6G range, I submitted the first three factors, all of which were new! Does this mean that this area really wasn't sieved for all values of k? Or is this area not reflected in the DB? Or is there just something odd because these three values have been PRP tested? Or am I just missing something?

I am confused. :crazy:

nuutti
02-14-2003, 07:12 PM
I have to admit that I don't know how Louie's database will like these factors, but you can mail all factors to me. I will keep a list of found factors for a while.

my e-mail is nuutti_dot_kuosa_at_kolumbus_dot_fi

Nuutti

philmoore
02-14-2003, 07:16 PM
Nuutti said:

> I think that sieving in the range 3,000,000 -> 20,000,000 should have priority right now, but when we have reached 10T we should consider double check sieving.

I think it isn't too early for a couple of people to be doing double check sieving for the 12 exponents for n=1 to 3,000,000. Consider: suppose hardware errors cause, for example, one out of every 1000 tests to be in error. Suppose a prime is missed because of a hardware error. The work to test up to n goes roughly as n^3. If a prime was missed at, say, n=2,000,000, we could easily test up to 20,000,000, doing 1000 times as much work, before finding a prime. The small chance of missing a prime means that double checking can lag quite a bit behind first time checking, but there is definitely a point at which there is a better chance of finding a prime by double checking a small value of n than first-time checking a large value of n. On the one hand, it would be nice to get some statistics about error rates, but double checking really doesn't have to start right away; it would be advantageous first to get some well-sieved values before starting.
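philmoore's factor of 1000 is easy to sanity-check: 20,000,000 is 10x 2,000,000, and under the stated n^3 growth a 10x range means roughly 10^3 = 1000x the work. A quick sketch of that back-of-the-envelope figure:

```python
def relative_work(n_high, n_low):
    """Ratio of cumulative testing work, assuming the work to test up to n
    grows roughly as n^3 (as argued in the post above)."""
    return (n_high / n_low) ** 3

ratio = relative_work(20_000_000, 2_000_000)  # -> 1000.0
```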

MikeH
02-15-2003, 06:18 AM
To try to prevent confusion for first time users, I have created a new thread for double check sieving discussions. :)

Joe O
02-17-2003, 10:27 PM
Well the console version does run at least 10% faster than the windows version, just as Paul Jobling stated.

But what I find very interesting is that both sieves run 11% faster under Windows NT than under Windows 98SE,
even though the NT machine is a Dual Celeron/450, and the 98SE is a PIII/500.
So the real difference of NT over 98SE is 23% (or more if we take into account the Dual penalty!)
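Joe O's 23% figure comes from normalising the observed 11% NT advantage by the 98SE machine's clock-speed edge. A quick check of the arithmetic (the 11% and the clock speeds are taken from the post above):

```python
# Observed: both sieves run ~11% faster under NT, even though the NT box
# has the slower CPU (450 MHz Celeron vs. 500 MHz PIII under 98SE).
nt_speedup = 1.11            # NT throughput relative to 98SE
clock_ratio = 500 / 450      # 98SE machine's clock advantage
real_gap = nt_speedup * clock_ratio - 1   # per-clock NT advantage, ~0.23
```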

Mystwalker
02-18-2003, 08:04 AM
One tip:
Don't even think about using a P4 Celeron for this job.
I temporarily got my hands on a 1.7 GHz Celeron here. The result is very disappointing. :(
Even a 800 MHz Duron beats the Celeron for good!

rosebud
02-18-2003, 12:50 PM
I experience the same with a P4 2.5 GHz.
It is about 15% slower than my Athlon 1 GHz.

Any idea how that can be??

Mierin
02-21-2003, 01:01 PM
I was reading an RSA-72 web page yesterday and it said that the P4 does not have a fast hardware instruction for rotate. If the sieving makes extensive use of rotate, that would explain it.

I think the P4 is moving to a RISC design with microcode to handle the removed instructions. If this is true, the microcode would have to do two shifts and one OR to simulate a rotate. You could get 1/3 of the expected speed.

I don't know how true all this is. Can anyone point me to a timing chart for P4 instructions? Is there a huge penalty for rotates?

DP
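For reference, the "two shifts and one OR" sequence DP describes would look like this (a Python sketch of a 32-bit left rotate; whether the P4 actually microcodes ROL this way is, as the post says, unconfirmed):

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, r):
    """Rotate the 32-bit value x left by r bits using two shifts and an
    OR -- the sequence a CPU without a cheap hardware rotate must execute."""
    r %= 32
    return ((x << r) | (x >> (32 - r))) & MASK32
```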

cjohnsto
02-21-2003, 08:03 PM
The P4 has made lots of changes like that. Another one I know of is that add-with-carry is now approx. 7 cycles (can't remember exactly), but an add is only 1 cycle.
This is bad for numbers > 2^32. I'm not sure if this affects factoring or not, but my intuition says that it would have at least a small impact.
Craig.

expinete
02-22-2003, 07:13 AM
I have sieved 655G (Moo_the_cow had already done it) and I found a new factor:

655009757159 | 55459*2^13313098+1


How could his program skip this number? Could it be a bug in the program??


I'm using sobsieve 1.24 under Win2000. If anyone can test if this is a real factor...


(55459*2^13313098+1 has another factor in 330G... it's just a problem with an old sob.dat)

Moo_the_cow
02-22-2003, 06:25 PM
What!?! I can't believe it. I'm using SoBSieve v.1.22, so there
are 3 possibilities (which you may already have thought of):
1.) You are using an old SoB.dat file and have just
found a duplicate (most likely)
2.) There is a bug in my version
3.) My CPU is not calculating accurately (least likely, because
I don't overclock my CPUs and this has never happened
before)

Nuri
02-22-2003, 07:43 PM
It's a duplicate; I see no problem there. That kind of thing happened to me too within the first two weeks of the sieving, when we were frequently changing clients (and their versions) and SoB.dat files. I was cross-checking a small range every time I switched clients, and mistakenly used an older SoB.dat file on one of the clients (and panicked because the client using the older one had two additional factors when compared to the other one).

Anyway, to clarify 655009757159 | 55459*2^13313098+1 issue, I checked for 655G-655.01G range on all of the four SoB.dat files I have.

As far as I could follow, the SoB.dat file was updated three times; these files are:

1- 4595 KB, dated 18/1/03,
2- 4523 KB, dated 22/1/03,
3- 4178 KB, dated 25/1/03, and
4- 3797 KB, dated 30/1/03.

All of the first three have 655009757159 | 55459*2^13313098+1 as a factor, whereas the last one does not.

So, probably Moo_the_cow is using the most current SoB.dat file, while expinete uses either one of the first three.

expinete, as you mentioned, 330714997043 | 55459*2^13313098+1 is another factor. It was found and submitted by Louie sometime between 25/01/03 and 30/01/03.

priwo
02-23-2003, 04:23 AM
I am using SoBSieve 1.22 and get the message

WARNING: 64 bit integers are badly aligned as qwords!

do I have a problem ? should I stop sieving ?

MikeH
02-23-2003, 05:45 AM
expinete wrote
I have sieve 655G (Moo_the_Cow already had done it) And I found a new factor
According to the sieve co-ordination thread, Moo has not yet declared this area as complete, so really this is one area you shouldn't be playing with. My hole finder reports the following (using the results.txt from 10 minutes ago) for that range.

658.71G - 660.00G, size: 1.29G, est fact: 48 (658712581133-660004372939)

Yesterday this hole was bigger (3.35G), so in addition to the other suggestions offered above, there is a good chance that Moo simply hadn't submitted anything for this range when you tried.

Mike.

MikeH
02-23-2003, 06:06 AM
While we are on the subject of holes, I have attached an output from my hole finder for the range 200G - 5000G.

I have checked the possible holes against the co-ordination thread. Any that are not declared as complete have a single space at the beginning of the line. Those that I have started re-sieving are marked with a '*', any others (those that have no space or '*' at the beginning of the line) are open to takers.

If anyone is interested in checking these (potential) holes, please double check that ranges are declared as complete, and co-ordinate activity on this thread.

Mike.

chvo
02-23-2003, 07:58 AM
I've started on the hole 3911.54G - 3915.14G.

When will SoBSieve 1.24 be posted on the seventeen or bust website?

smh
02-23-2003, 08:39 AM
From the "holes" file


SERIES: 3911.54G - 3915.14G, size: 3.60G, est fact: 29 (3911540412763-3915137233187)
SERIES: 3926.48G - 3930.06G, size: 3.58G, est fact: 32 (3926484634733-3930061625759)
SERIES: 3941.54G - 3945.05G, size: 3.51G, est fact: 25 (3941536383347-3945047319599)
SERIES: 3956.30G - 3960.05G, size: 3.75G, est fact: 36 (3956304590099-3960053131451)
SER+RAT: 3970.03G - 3975.01G, size: 4.98G, est fact: 35 (3970034811803-3975012508177)
SER+RAT: 3985.51G - 3990.08G, size: 4.57G, est fact: 27 (3985510607761-3990079251871)
SER+RAT: 4000.01G - 4005.03G, size: 5.02G, est fact: 35 (4000011208327-4005029348777)
SER+RAT: 4015.30G - 4020.14G, size: 4.85G, est fact: 38 (4015295346311-4020141218923)
SER+RAT: 4030.54G - 4035.38G, size: 4.84G, est fact: 45 (4030542073183-4035379652009)
SER+RAT: 4045.39G - 4050.26G, size: 4.87G, est fact: 33 (4045385419291-4050257987239)
SER+RAT: 4060.46G - 4065.31G, size: 4.85G, est fact: 35 (4060459019593-4065313137757)
SER+RAT: 4075.53G - 4080.03G, size: 4.50G, est fact: 32 (4075530343427-4080031510823)
SER+RAT: 4090.31G - 4095.26G, size: 4.95G, est fact: 36 (4090307684017-4095259707527)
SER+RAT: 4105.45G - 4110.19G, size: 4.75G, est fact: 34 (4105445717053-4110192811271)
SER+RAT: 4119.92G - 4125.10G, size: 5.18G, est fact: 36 (4119922770037-4125103526893)
SER+RAT: 4134.97G - 4140.20G, size: 5.23G, est fact: 31 (4134974381183-4140202239367)
SER+RAT: 4149.40G - 4155.32G, size: 5.91G, est fact: 39 (4149402363289-4155315689257)
SER+RAT: 4164.53G - 4170.08G, size: 5.55G, est fact: 44 (4164530654129-4170082334729)
SER+RAT: 4178.82G - 4185.14G, size: 6.32G, est fact: 59 (4178824552081-4185144302257)
*SER+RAT: 4193.89G - 4200.22G, size: 6.33G, est fact: 54 (4193888050727-4200216868171)

These are all in a range sieved by Louie.

Louie: It looks like something has gone wrong. Did you forget to submit these, or were they not sieved one way or another?

Nuri
02-23-2003, 09:32 AM
I am writing this for coordination purpose.

I have previously chatted with Louie on that issue (the holes in 3900G-4200G range). He is aware of that and is taking care of it. So, they will be submitted soon. No problem there.

Also, I am sieving the holes in the 3650G-3700G range. Just not submitted yet. Waiting for it to finish. Will take a couple of days more.

Regards,

Nuri

MikeH
02-23-2003, 02:33 PM
Nuri et al,

I don't want to duplicate effort, so I have stopped my re-sieving, and submitted the factors. This is how far I'd got:

4193888050727 - 4197649481731
3649834328981 - 3657925328989

For the future, can we please agree on a co-ordinated method of declaring holes, and to whom the re-sieve is assigned.

We are getting very close to n>3M being assigned for PRP tests. It would be really nice to have p<5T complete before PRP assignment. This is something I think no one would have even contemplated two months ago.

And for information, the 5T<p<10T is currently 32% complete. Maybe we can get this to 50% before PRP testing begins? Now wouldn't that be good.

Mike.

Nuri
02-23-2003, 06:31 PM
Hi MikeH,

Of course, there is no problem with that. At the end, we are all trying to reach the same goal.

The reason I did not declare these two ranges (3900G-4200G and 3650G-3700G, both with a couple of holes) is that there seemed to be nobody else looking for holes. In other words, I did not think anybody had even noticed them. So I contacted Louie directly, so as not to confuse anyone.

Louie told me that he had scheduled to take care of his 3900G-4200G already, so I just started sieving the 10 holes in 3650-3700 range (one already finished and three are in progress).

And as soon as I noticed that somebody else was looking for holes, I wrote the coordination post above.

If anybody wants to take any of the remaining six holes, just let me know which one you are taking (so that we don't duplicate effort). If not, I am already patching them. Both alternatives are ok for me.

chvo
02-24-2003, 03:13 AM
I'll stop re-sieving too. I didn't get that far though:

3911540412763 - 3912478299240

I've submitted the factors that I've found.

ColinT
02-25-2003, 01:56 PM
I see there is a lot of talk about "Sieving" and Factors". I have no idea what these are. Is it something that becomes obvious when you get your degree in math?

I have run the SoB client for over 300 days. Now I see there is yet another client. When did this start and is it important?

Is there a written explanation of all this? Something that I could understand? I was never good at math. I took an Algebra course once in HS. I can't recall if I passed it. That happened 40 years ago. Maybe they have some new math now.

smh
02-25-2003, 03:19 PM
Don't know about any paper or web page about factoring or sieving, but let me try to explain things a little.

What SoB tries to do is find prime numbers by performing a probable primality test, but most numbers turn out to be composite. Another way of proving a number composite is to find a factor of it. What we try to do with sieving is find 'small' prime factors of all the candidate numbers, just like GIMPS does with trial factoring. But instead of trying to factor one candidate at a time, the sieve takes a prime number and checks whether it divides any of the candidates left.

At the moment we are testing the remaining 12 k values for N between 3,000,000 and 20,000,000.
This is a total of 204 million numbers. Most numbers can be divided by small primes, so we don't have to perform a PRP test on them (a single test will take days on the higher N's). At the moment there are about 630,000 numbers left whose status is unknown. (You can see this number on the project stats page.) As long as sieving eliminates more numbers in a given time period than PRP testing does, it's worthwhile to continue sieving.

There is a sieve program available on the website, but it requires manual configuration, reservation and returning of results, and you won't find a prime with it.

It's important that a small group continues sieving, but it's certainly not meant to make people stop running the SoB client. Just do whatever you think is the most fun.
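The central question the sieve asks, as described above, can be sketched in Python (illustrative only -- the real clients answer it far faster with discrete-logarithm tricks rather than testing candidates one by one):

```python
def candidates_divisible_by(p, candidates):
    """Return the (k, n) pairs for which the prime p divides k*2^n + 1.

    This is the naive form of the sieve's core test; three-argument pow
    keeps the arithmetic small even for huge exponents n.
    """
    return [(k, n) for (k, n) in candidates if (k * pow(2, n, p) + 1) % p == 0]

# Tiny illustration: 13 divides 3*2^2 + 1 = 13, but not 5*2^1 + 1 = 11.
hits = candidates_divisible_by(13, [(3, 2), (5, 1)])
```

The project's real k and n values are of course vastly larger; the point is only that one prime is tested against every remaining candidate at once.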

ColinT
02-25-2003, 09:52 PM
Thanks for the info, smh. I'll pass on sieving and continue running the SoB client:)

Moo_the_cow
02-28-2003, 06:38 PM
Hey, where are the new clients? There hasn't been one in
3 weeks :eek: . Come on, I want my SoBSieve 1.25 and
NBeGon 011 (preferably with a >20% speed increase) :)

MikeH
03-01-2003, 08:19 AM
To celebrate the start of the PRP testing of n>3M, here is a little summary of the sieving efforts to date. I have used the latest results200.txt and results.txt to generate the numbers below.

The range 1G<p<200G is 100% complete.
The range 200G<p<5T is ~97% complete.
The range 5T<p<10T is ~48% complete.

Quite an accomplishment!

I've also had a stab at producing some (very unofficial) stats.

I have scored as follows. Each unique factor for a k/n pair scores p/1T, e.g. a factor where p=825G will score 0.825. Only the lowest p for a k/n pair scores; duplicates at a higher p do not score. Factors for n<3M are not scored. Factors for n>20M are not scored.
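In code, that scoring rule reads roughly as follows (a sketch; the tuple layout and the example values are invented for illustration):

```python
def score(factors):
    """Score (p, k, n) factor records per the rules above: only the lowest
    p per (k, n) pair counts, only 3M <= n <= 20M scores, and each scoring
    factor is worth p / 1T."""
    lowest = {}
    for p, k, n in factors:
        if (k, n) not in lowest or p < lowest[(k, n)]:
            lowest[(k, n)] = p
    return sum(p / 1e12 for (k, n), p in lowest.items()
               if 3_000_000 <= n <= 20_000_000)

# A duplicate at higher p and a factor with n < 3M both score nothing,
# so only the p = 825G factor counts here, scoring 0.825.
total = score([(825e9, 4847, 4_000_000),
               (900e9, 4847, 4_000_000),
               (500e9, 21181, 100_000)])
```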

Here are the top 5 for users and teams. I've attached the full stats. For column headings, FacU is number of unique factors, FacD is number of duplicate factors, everything else should be reasonably obvious.

UserId UserName Score ( %) FacU ( %) FacD n<3M 3M<n<20M n>20M
1 louie 48041.38 (28.98) 144251 (53.62) 7197 507 143744 0
0 unknown 16607.47 (10.02) 9392 ( 3.49) 1473 108 9284 0
1608 MikeH 12826.83 ( 7.74) 8123 ( 3.02) 1466 0 8123 0
1577 OrkunBanuTST 7427.97 ( 4.48) 8843 ( 3.29) 677 0 8843 0
627 Antiroach 6368.77 ( 3.84) 1795 ( 0.67) 136 0 1795 0

TeamId TeamName Score ( %) FacU ( %) FacD n<3M 3M<n<20M n>20M
3 Michigan 53512.53 (32.28) 155805 (57.91) 8406 678 155127 0
0 unknown 46657.17 (28.15) 39008 (14.50) 4532 674 38334 0
81 TeamPrimeRib 10243.45 ( 6.18) 14704 ( 5.47) 1352 14 14690 0
9 Rechenkraft 9565.25 ( 5.77) 11339 ( 4.21) 1505 3 11336 0
53 Anandtech 6914.17 ( 4.17) 3390 ( 1.26) 461 21 3369 0

Mystwalker
03-01-2003, 11:56 AM
There seems to be a slight stats glitch (on the SoB HP):

According to the stats, 3111 tests are pending and there are 3102 tests <3M remaining. That would indicate that there already n's >3M handed out.
But a look at the test window (which is 7 minutes younger) shows that we're still 4K away from that range - which translates into 100-200 tests still to do...

Joe O
03-01-2003, 01:18 PM
Nice job MikeH!

The only thing that I would like to point out to everyone is that some of the "holes" are still being worked on, i.e. the people reserving them have not reported them complete. Louie is taking care of another range (see the posts above, 7 or 8 or so up). So it may be too soon to start resieving the holes! Maybe let the people who did a range originally redo it, for now at least?

MikeH
03-01-2003, 01:44 PM
Thanks Joe,

You're dead right on the 'holes'. I'd only included the files so that anyone who's interested could see where the missing 3% and 52% were. I haven't looked at the 5T-10T file, but certainly for p<5T I think there are no holes which are unaccounted for.

Many thanks for pointing that out. The last thing I'd encourage is duplication of effort.

Mike.

Nuri
03-01-2003, 07:20 PM
Very nice job Mike.

Both the hole finder and the stats are cool. :thumbs:

Mystwalker
03-02-2003, 10:58 AM
If someone likes, he can recheck the range 7086 - 7089.

I had a hole between 7086093056041 and 7088983041311 -> almost 2.8G big. The distance between two successive factors is usually a lot smaller...

paul.jobling
03-04-2003, 11:08 AM
Originally posted by Moo_the_cow
Hey, where are the new clients? There hasn't been one in
3 weeks :eek: . Come on, I want my SoBSieve 1.25 and
NBeGon 011 (preferably with a >20% speed increase) :)

I have got a SoBSieve 1.25 here that I have developed... it is a little faster than 1.24 - around 15% faster in tests with this 466 MHz PIII at p=1T.

;)

I suppose I ought to release it now!

Regards,

Paul.

expinete
03-04-2003, 01:55 PM
If someone likes, he can recheck the range 7086 - 7089.


I'll check that and tell you if I found something

Troodon
03-04-2003, 02:33 PM
SoBSieve 1.25 is slower than the console version and takes much more memory (and more than 1.24), at least on my Tualatin.
I would like to see a console version where I could change the time between screen output :p .
Thanks Paul!

Joe O
03-04-2003, 03:00 PM
I didn't check the memory usage, but on a PIII/500 and a Celeron/450 it's slower than the console version.
Any chance for a faster console version?

Nuri
03-04-2003, 04:30 PM
SoBSieve 1.25 is 1% faster than 1.24 on my P4-1700, at p=6T

jjjjL
03-04-2003, 06:48 PM
v1.25 is 7% faster than 1.24 for me.

Someone should do a graph of the relative speeds of all the sieves.

It would be neat if someone made a graph showing the first time each version of each sieve (nbegon, sobsieve, sobsieve console) was posted to the forum vs how long it takes for that version to sieve a 1G range of factors (say around p=10T) on a constant arch (with a properly set alpha factor for each version just to be fair).

Wouldn't be too much work. All the versions and dates are on the forum or Phil's site.

Everyone loves time vs. time graphs. :)

-Louie

paul.jobling
03-05-2003, 04:42 AM
Originally posted by Joe O
I didn't check the memory usage, but on a PIII/500 and a Celeron/450 it's slower than the console version.
Any chance for a faster console version?

Ah, you are all speed freaks rather than GUI freaks! Well, that is absolutely fine by me....

...I just did a build of the Console version, and it was no faster. That is not much of a surprise, really, as a lot of the extra speed came from changing the compilation options (I actually turned *off* compiler optimisations!).


Regards,

Paul.

Mystwalker
03-05-2003, 11:32 AM
Maybe you can enhance the "comfort" of the console version if speed is already at max... :)

Some things that came to my mind after some weeks of using:

- shorter save times
Right now I have to manually copy the current p value into the config file - at least on those computers that don't hibernate...
Alternatively, a key combination to write the current status to file and exit - if implementing this handler doesn't affect performance.

- option to fine-tune alpha setting
I guess here one can squeeze out an additional percent or two. According to my observations, the optimal value depends on the system architecture and the search range seems to have a slight influence, too.

- fewer status outputs (optional)
As the speed is already excellent, I get 1.5 status updates per second on some systems. I'm not sure if there will be a performance improvement when the output rate is reduced, but I guess it's worth a try.

- output of "clean" factor file
When computation of the set range is completed, the program could create a new file containing just the factors from the config file. Maybe overwrite protection would be a good idea, to prevent overwriting factor files not yet submitted...

paul.jobling
03-05-2003, 11:58 AM
Some things that came to my mind after some weeks of using:

- shorter save times
Right now I have to manually copy the current p value into the config file - at least on those computers that don't hibernate...
Alternatively, a key combination to write the current status to file and exit - if implementing this handler doesn't affect performance.


OK, that might be possible.



- option to fine-tune alpha setting
I guess here one can squeeze out an additional percent or two. According to my observations, the optimal value depends on the system architecture and the search range seems to have a slight influence, too.


Ah, there is a -a=<value> command line switch to change the alpha. 2.5 is used by default. Ah, but that doesn't seem to work...



- fewer status outputs (optional)
As the speed is already excellent, I get 1.5 status updates per second on some systems. I'm not sure if there will be a performance improvement when the output rate is reduced, but I guess it's worth a try.


Hardly *any* of the time is spent doing IO. I could reduce it, but the net effect would be about a 0.001% gain.




- output of "clean" factor file
When computation of the set range is completed, the program could create a new file containing just the factors from the config file. Maybe overwrite protection would be a good idea, to prevent overwriting factor files not yet submitted...

There is another approach that I could use, but backward-compatibility would be a problem. Are you not using the small utility that was posted up here to clean the file before submitting it?

Regards,

Paul.

expinete
03-05-2003, 12:38 PM
If someone likes, he can recheck the range 7086 - 7089.


I found another factor in the interval
: 7088930748583 | 24737*2^14280367+1

But I think it may be a duplicate

smh
03-05-2003, 03:22 PM
That's strange: 24737*2^14280367+1 is divisible by 5 (according to NewPGen 2.8), so this number shouldn't show up in any dat file.:confused:

Mystwalker
03-05-2003, 08:02 PM
Are you not using the small utility that was posted up here to clean the file before submitting it?

If you mean the Java application - I wrote it. :D
Sure, it works, but I thought it would be even more comfortable to skip that part. Would be one double-click less... :p

ceselb
03-05-2003, 08:40 PM
Just copy the results directly from the SoBSieve window. Why complicate things unnecessarily?

jjjjL
03-06-2003, 01:44 AM
Originally posted by smh
That's strange, 24737*2^14280367+1 is divisible by 5 (according to NewPGen 2.8), so this number shouldn't show up in any dat file. :confused:

That number isn't divisible by 5. I just checked it with GMP. Maybe you checked it without the +1 or some other NewPGen mode?

Here are the factors in the db for it.

58828348459 | 24737*2^14280367+1
7088930748583 | 24737*2^14280367+1

-Louie

smh
03-06-2003, 03:10 AM
:bang: Don't know what I've been doing wrong. I couldn't imagine that it was divisible by five.

Just checked again with newpgen 2.70, and it doesn't divide by 5. newpgen 2.80 crashed while testing.

paul.jobling
03-06-2003, 04:52 AM
Originally posted by smh
:bang: Don't know what I've been doing wrong. I couldn't imagine that it was divisible by five.

Just checked again with newpgen 2.70, and it doesn't divide by 5. newpgen 2.80 crashed while testing.

You are right about NewPGen 2.80 crashing - thanks for spotting that. It happens when there is only 1 number in the fixed k sieve. I have fixed it now.

Regards,

Paul.

Joe O
03-06-2003, 11:25 AM
How about a "Done" in the SobStatus.dat file when it finishes?
It does display it on the console!

Mystwalker
03-09-2003, 08:07 PM
I found another factor in the interval:
7088930748583 | 24737*2^14280367+1

Ok, I just rechecked that factor on the very same machine that let it slip the first time. Now it did find the factor.
Seems like it was a computing error...

THX expinete:cheers:

OberonBob
03-10-2003, 09:14 AM
Maybe I am being presumptuous, but I started my little 533 Alpha on the 3560 - 3570 range that Sonicbadger may or may not finish. It will take me a week to do that range, and maybe Sonicbadger will have it done by then, but if not, I will. The first factor I found in that range, within the first ten minutes, had not been reported yet, so I am going to bet that it isn't being done.


Someone tell me to stop, and I will. I understand that the lower ranges have more factors, so I figure this is more productive than picking up a block in the 9000+ region.


--
OberonBob
Team ORU

MikeH
03-10-2003, 02:47 PM
(originally posted on the n<3M double check thread)

I have updated and improved (slightly) my unofficial sieving stats. They can be found here (http://www.aooq73.dsl.pipex.com/scores.txt) .

I now have an 'excluded' category. I am using a 100<n<50M Sob.dat file sieved to 1G as the test for validity. Excluded factors don't score.

Scoring is also slightly different:
n < 3M, score = p/1T * 0.5 * ((n*n)/(3M * 3M))
3M < n < 20M, score = p/1T
n > 20M, score = p/1T * 0.05
duplicates score = score * 0.01

This won't be updated daily, but I'll make sure it's more often than weekly.

Mike.

Troodon
03-10-2003, 04:21 PM
I think that sonicbadger's range should be sieved ASAP because we're already at n > 3M.

Nuri
03-12-2003, 04:14 PM
This post is for coordination purposes.

I suspect that there is a 25G hole within the following range.

pmin=8024924676959, pmax=8050234552069

8010G-8050G is marked as complete (8010 - 8050 jimmy [complete]), and no new factors were submitted for 8025-8050 range within the last two results.txt files.

I'll wait for another day to avoid duplication of work (i.e. to be sure that jimmy did not submit the factors for that range and/or that somebody else has already started resieving without notice).

If any one of you is interested in resieving that range instead of me, please feel free to post below and I'll hand it over. If not, I'll sieve it anyway.

Regards,

Nuri

Troodon
03-12-2003, 06:06 PM
I want to help you. I propose to split it in two parts, one for you and another for me - this way it would be finished faster. Do you like this proposal?

Nuri
03-12-2003, 06:25 PM
Ok.

I will reduce my range to 8025-8035. Please feel free to take 8035-8050, Troodon.

But please, wait one more day before starting the resieve (to be sure that it was not sieved).

Happy sieving.

Troodon
03-12-2003, 06:36 PM
OK, I'll start (re)sieving the range 8035 - 8050 after 86400 seconds :p . Thanks!

Nuri
03-12-2003, 06:54 PM
Be sure to check the results.txt file (http://www.seventeenorbust.com/sieve/results.txt) tomorrow before starting. ;)

jimmy
03-13-2003, 11:22 AM
Hi,

I'm sure, I sieved this range.

I split the range: 8010 - 8025 was sieved at home and 8025 - 8050 was sieved at work. I will check my computer at work tomorrow and answer tomorrow evening (Germany time). So please wait with resieving until then.

I'm sorry about the trouble,
Jimmy

BTW: My English is not the best.... :spank:

Nuri
03-13-2003, 11:48 AM
It's no trouble at all Jimmy.

It's nice to hear that the range is already sieved. We saved roughly 70 more candidates without any additional work. :thumbs:

Please let us know the result.

Happy sieving all.

Nuri

jimmy
03-14-2003, 09:22 AM
I've found the Dat-file (8025-8050) and will submit it this weekend. :)

Jimmy

Joe O
03-15-2003, 10:41 AM
Version 1.25 running overnight averaged 37443 p/sec for a 6 hour run. This morning it had a sustained average of 37792 p/sec for a 90 minute run.
The console version got 37401 p/sec for a 3 hour run. The highest hour was only 37465 p/sec. There was no other activity during this hour, and only 1-2 minutes of Windows Explorer during the other 2 hours. All of this on a PIII/500 running Win98SE. I will have to recheck the Celeron/450 as well. My previous comments were for one hour runs, where the console version edged out the new release by 0.5%.
Just to make it clear, version 1.25 edged out the console version for this timing test! Not by much, but every bit counts.









Originally posted by Joe O
I didn't check the memory usage, but on a PIII/500 and a Celeron/450 it's slower than the console version.
Any chance for a faster console version?

MikeH
03-16-2003, 12:08 PM
More sieve stats (http://www.aooq73.dsl.pipex.com/scores.txt) changes.

I'm now using the range information from the coordination threads to identify the factors that were submitted when users weren't logged in.

As a result we have 10 new users in the stats. Welcome. Sorry to anyone that's fallen a few places as a result :(

It also means I claim the number two spot from the pesky [unknown]. ;)

On the subject of hole finding, I'm sure you'll all be glad to hear that we are hole free right up to 7375G (Nuri, my fellow hole finder, do you agree?). Well done everyone. :|party|:

Nuri
03-16-2003, 12:57 PM
Hi Mike,

I just checked for holes yesterday, so I can clearly comment on that. I agree with you, but I want to rephrase it to something like, "even if there might be some tiny holes below 7375G, it won't be worth the effort to check for them".

To be more specific,

I'm pretty sure we've patched every hole below 7000G. There are two regions with strange characteristics (too many holes of width 1G-2G very close to each other, with no factors) between 7000G and 7375G. Even if all of them were actual holes, there shouldn't be more than 30 new factors in total.

I am currently considering whether it's worth the try. I'll let you know if I decide to give them a shot.

For 7375G<p, there were some ranges marked as [Complete] that obviously lacked submitted factors as of yesterday. I am sure these were sieved but not submitted (as in jimmy's case), and will be submitted soon.

Anyway, congratulations everybody. Keep up the good work. :|party|:

Nuri
03-16-2003, 01:04 PM
Originally posted by MikeH
I'm now using the range information from the coordination threads to identify the factors that were submitted when users weren't logged in.

But this is only for the factors with userid null, right?

MikeH
03-16-2003, 02:08 PM
Nuri said
But this is only for the factors with userid null, right?
Correct.

Moo_the_cow
03-16-2003, 02:55 PM
Mike, could you merge the last 2 sieve accounts
(Moo_the_cow and Mooeythecow) into the
Moothecow account? The last 2 sieve accounts were
test accounts that I used before I got my password.

P.S. Why did you id the unknown's? You made me drop
4 places (from 23rd to 27th) :mad:

priwo
03-16-2003, 03:19 PM
Thank you Mike for having me in your sieve stats; I did not notice that I should be logged in while submitting factors.

sorry for you moo

MikeH
03-16-2003, 04:05 PM
Moo said
Mike, could you merge the last 2 sieve accounts
'Tis done. Sorry it hasn't improved your position. :cry:

I've also merged

54 peterzal 23.49 ( 0.01) 7 ( 0.00)

into

3 Antiroach (peterzal) 9732.91 ( 4.74) 2253 ( 0.73)

Is there anyone else? Either to merge, or who is not identified in the scores by their forum alias (where SB name is different)?


priwo said
thank you mike that i am in your sieve stats
My pleasure. :thumbs:

Moo_the_cow
03-16-2003, 04:41 PM
I also suggest merging account #54 (Reto) into
account #12 (biwema) since I'm guessing
that Reto Keiser's forum name is biwema
(Reto, please correct me if I'm wrong)
If you're wondering why I'm guessing that,
it's because I saw this in the "Sieve: Distribution
of Factors vd. p value" thread, page 2

quote:
_________________________________________________
Is there a P-1 factorer available which can handle 3Mbit Sierpinski numbers? It would be quite a lot of fun to find huge factors.
Maybe I should concentrate on smaller p as it is quite a lot more effective. We will see.

biwema (Reto Keiser)
__________________________________________________

MikeH
03-16-2003, 05:04 PM
Thanks Moo. I'd never have noticed that one.

Also done.

Mike.

Joe O
03-19-2003, 11:14 AM
Well version 1.25 wins again. On one side of the dual there was a 1.32% improvement over the console version, and on the other side a 1.55% improvement. These were multi-day averages. Yes the memory usage is significantly higher (2596k vs 543k), but it is faster. The paging file is at 43MB, well below the recommended 203MB. There is 192MB of PC100 memory.

paul.jobling
03-19-2003, 11:30 AM
Hi Joe,

I am interested in the extra speed in 1.25 over the console version. I attach a console version of 1.25 - can you see if that is any faster?

Regards,

Paul.

Joe O
03-19-2003, 12:03 PM
Paul, I've got the new one running on both sides of the Celeron. Memory usage is 1708KB/1664KB. I'll let it run for a couple of hours, maybe overnight, and see what happens. Joe.

Mystwalker
03-20-2003, 08:15 AM
Just out of curiosity:
What happens when someone submits a factor for a k/n pair that is currently PRP tested? Will this test be aborted the next time the client tries to submit a block?

Joe O
03-20-2003, 10:25 AM
Paul,
The new console version is faster than Version 1.25 by 1.53% on one side of the dual and 1.67% faster on the other side.

Averaging both sides we get:

New Console   97309.21869   1.60% improvement over
Version 1.25  95773.41405   1.44% improvement over
Old console   94417.23113


The New Console is 3.0% faster than the Old Console.

Joe.

ps Care to guess which one I will be using?

paul.jobling
03-20-2003, 10:30 AM
Originally posted by Joe O
Paul,
The new console version is faster than Version 1.25 by 1.53% on one side of the dual and 1.67% faster on the other side.

The New Console is 3.0% faster than the Old Console.

ps Care to guess which one I will be using?

3%, so nothing amazing. I just noticed that I built it with the old optimizations, so here is another build using the new optimizations. Is this any faster?

Regards,

Paul.

Mystwalker
03-20-2003, 11:14 AM
Well, on my P3-m, the new version is even a little bit slower than the old one.
I'm not sure how they compare on my Duron, as there are a bunch of programs running ATM, so there is no detailed speed testing possible...
On a first look, the speed is not that different, maybe a bit faster? I don't know for sure, though. :(

Troodon
03-20-2003, 12:50 PM
On my Tualatin, the console version 1.25 (the first one) seems to be a bit faster than the console v 1.24.

Joe O
03-20-2003, 02:26 PM
Paul, the preliminary results (2hr run) don't look good.

93689.61022 -3.7% Slower than the previous version.

This is the slowest one of all. Now I had inadvertently left task manager running, so I'm going to run another 2 hours to see if that made any difference. If not, ....


The final results (3hr run) are not much better.

93962.34731 -3.44% Slower than the previous version.

Troodon
03-21-2003, 01:24 PM
Originally posted by Mystwalker
Just out of curiosity:
What happens when someone submits a factor for a k/n pair that is currently PRP tested? Will this test be aborted the next time the client tries to submit a block?

I would like to see that question answered, too. For example, I'm sieving 10660-11000 and the last factor found is 10694103276187 | 22699*2^3000430+1 (and currently it's the smallest one found in that range).

mklasson
03-22-2003, 11:52 AM
I'm curious as to how your sieving is actually done. I'm toying with a little siever myself, but it's quite slowish...

Do you try all p \equiv 1,3,7,9 (mod 10), or do you try to factor p for a while first or something like that?

After that, what do you do next? It seems to me you can't just brute-force your way through the ~660k n values and actually calculate the modulus for each, seeing as how you process somewhere around 100k p values per second. Speaking of which, how's that rate calculated? Do you count the p values actually tested, or is it p_max minus p_min, or something else?

I've also thought about calculating d and r, such that p | k*2^(d+r*t), t \in N, but I can't see how to do that fast enough either.

Could you share some information?

paul.jobling
03-24-2003, 04:54 AM
Originally posted by mklasson
I'm curious as to how your sieving is actually done. I'm toying with a little siever myself, but it's quite slowish...

Do you try all p \equiv 1,3,7,9 (mod 10), or do you try to factor p for a while first or something like that?


A windowed Sieve Of Eratosthenes is used, and the p values that are generated are all primes. This can be done quickly, using a bitmap.



After that, what do you do next? It seems to me you can't just brute-force your way through the ~660k n values and actually calculate the modulus for each, seeing as how you process somewhere around 100k p values per second.


Look up the "baby-step giant-step" discrete logarithm algorithm. Basically, it finds those n values such that k.2^n+1 = 0 (mod p), but very quickly.



Speaking of which, how's that rate calculated? Do you count the p values actually tested, or is it p_max minus p_min, or something else?


It is indeed p_max - p_min.



I've also thought about calculating d and r, such that p | k*2^(d+r*t), t \in N, but I can't see how to do that fast enough either.

Could you share some information?


That is the algorithm that is used, which is why the time is proportional to the square root of n_max-n_min.

Okay, here are the details:

We want to find n such that p | k.2^n+1

Express n as nmin+a+b*arange, with 0<=a<arange; 0<=b<brange.

Then

p | k.2^(nmin+a+b*arange)+1 ==>

-k.2^(nmin+a+b*arange) = 1 (mod p) ==>

-k.2^(nmin+a) = (1/2)^(b*arange) (mod p)

So what I do is generate (1/2)^(b*arange), for b=0 to brange-1 (the RHS). These values are saved away.

Then for each k, I generate -k.2^(nmin+a) for a=0 to arange-1 (the LHS), and as I calculate each value I see if it is on the RHS. If it is, then the a and b values tell me the n value to remove from the sieve.

This is all done using tight assembler for extra performance.

Regards,

Paul.
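The baby-step/giant-step scheme Paul describes above can be sketched in Python. This is an illustrative reimplementation with hypothetical names (`bsgs_factor_n` etc.), not the SoBSieve code - the real siever does all of this in tight assembler with 64-bit mulmods - but the decomposition n = nmin + a + b*arange and the hash-table matching are the same idea.

```python
import math

# Illustrative sketch of the baby-step/giant-step search described above:
# find n = nmin + a + b*arange in [nmin, nmax) with k*2^n + 1 == 0 (mod p),
# i.e. -k*2^(nmin+a) == (1/2)^(b*arange) (mod p).
def bsgs_factor_n(k, p, nmin, nmax):
    nrange = nmax - nmin
    arange = max(1, math.isqrt(nrange))       # ~sqrt(range) steps each side
    brange = (nrange + arange - 1) // arange
    giant = pow(pow(2, -1, p), arange, p)     # (1/2)^arange mod p
    # RHS: save (1/2)^(b*arange) mod p for b = 0..brange-1 in a hash table
    rhs, t = {}, 1
    for b in range(brange):
        rhs.setdefault(t, []).append(b)
        t = t * giant % p
    # LHS: walk -k*2^(nmin+a) mod p for a = 0..arange-1, doubling each step
    hits, lhs = [], (-k) * pow(2, nmin, p) % p
    for a in range(arange):
        for b in rhs.get(lhs, []):
            n = nmin + a + b * arange
            if n < nmax:
                hits.append(n)
        lhs = lhs * 2 % p
    return hits
```

For example, the factor 7088930748583 | 24737*2^14280367+1 reported earlier in this thread is found by `bsgs_factor_n(24737, 7088930748583, 14280000, 14281000)`.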

Joe O
03-24-2003, 08:31 AM
Paul,
Any chance for a faster sieving program?

paul.jobling
03-24-2003, 09:15 AM
Originally posted by Joe O
Paul,
Any chance for a faster sieving program?

:D LOL, I would if I could!

Regards,

Paul.

mklasson
03-25-2003, 02:36 PM
Paul,

thanks for the info. I figured out the same scheme (more or less) later on, but it's good to hear the current approach is the same thing. Or bad... a new speedup would've been nice. :)

I suppose it's really pointless to ask, as I'm sure you've tried to optimize the living daylight out of your siever already. But anyway: you've selected optimal values for arange and brange, taking into consideration the different costs for one step of
"k.-2^(nmin+a) for a=0 to arange-1"
(fastest implemented by having 12 initial values and just leftshifting them by 1 and using sub for the mod?)
vs one step of
"(1/2)^(b*arange), for b=0 to brange-1"
and so on?

Again, I don't mean to belittle your effort, I'm just hoping you've forgotten something. :)

Cheers,
Mikael

paul.jobling
03-26-2003, 04:32 AM
Originally posted by mklasson

you've selected optimal values for arange and brange, taking into consideration the different costs for one step of
"k.-2^(nmin+a) for a=0 to arange-1"

vs one step of

"(1/2)^(b*arange), for b=0 to brange-1"


Mikael,

The 'alpha' value is the ratio between arange and brange.

No, I do not double using a leftshift and sub; instead I use a 64 bit a=b*c mod d. That is because for these particular k values, the n values are sparse - if you look you will see that the remaining n values for a particular k are all =3 mod 24 (for example), or something similar. So I perform a multiply by 2^24 rather than performing 24 doublings that will not get rid of any n values. I can multiply by 2^24 or 2^48 or 2^36 faster than I can do 24/48/36 leftshifts and subtractions.

Regards,

Paul.
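Paul's multiply-by-2^t point can be illustrated with a tiny sketch (hypothetical helper names, assuming the remaining n values for a given k all lie in one residue class mod t, as he describes):

```python
# Sketch of the point above: if all remaining n for a k are in one residue
# class mod t, one modular multiply by the precomputed 2^t mod p advances
# the LHS as far as t individual shift-and-subtract doublings would.
def make_stepper(p, t):
    step = pow(2, t, p)          # computed once per p
    def advance(x):
        return x * step % p      # one mulmod replaces t doublings
    return advance
```

The design point is that a single 64-bit a=b*c mod d is cheaper than 24/36/48 shift-and-subtract steps that could never remove an n anyway.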

mklasson
03-26-2003, 10:01 AM
Paul,

ok, I'll just shut up now then. :)

Cheers,
Mikael

mklasson
03-27-2003, 09:08 PM
Paul...

say, have you thought about something like this:

we know 2 | p-1. Therefore calculate 2^((p-1)/2) mod p, and see if it's == 1. If it is, then for each k calculate (p-k^-1)^((p-1)/2) mod p and see if it's == 1. If it isn't, then you can safely ignore that k for this p.

You could then optimize arange and brange according to the number of k's left.

Assuming the cost of calculating the t^((p-1)/2) is negligible, and that arange and brange are optimized to minimize remaining_k*arange+brange (probably not a very optimal formulation), the speedup seems to land somewhere around 15%. About 25% of the k values are discarded on average.

It might even be worth trial factoring p-1 a little bit to get a potentially better estimate of order(2,p), and hence increase the chances of eliminating a k. When the order of 2 is known, the speedup hovers somewhere around 30% instead of 15.

Cheers,
Mikael
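Mikael's proposed test can be sketched as follows. This is an illustration with hypothetical names, not anyone's actual siever code; it covers only the basic (p-1)/2 case, not the trial-factoring refinement:

```python
# Sketch of the test proposed above: k*2^n == -1 (mod p) forces -k^-1 to be
# a power of 2 mod p. If 2 is a QR mod p (so every power of 2 is a QR) but
# -k^-1 is not a QR, then this k can safely be skipped for this p.
def k_survives_qr_test(k, p):
    if pow(2, (p - 1) // 2, p) != 1:
        return True                  # 2 is not a QR; test is inconclusive
    a = (-pow(k, -1, p)) % p         # -k^-1 mod p
    return pow(a, (p - 1) // 2, p) == 1
```

A quick sanity check: for p = 73 (where 2 is a QR), every k the test discards really has no n with p | k*2^n+1.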

paul.jobling
03-28-2003, 04:49 AM
Mikael,

Another thing that the code does is to perform a fast Kronecker test. For each k there are certain p values that can never divide any of the k.2^n+1 values, and this is based on the Kronecker symbol between the p and the k:

For odd n, p can divide k.2^n+1 iff Kronecker(-2k/p)=1.
For even n, p can divide k.2^n+1 iff Kronecker(-k/p)=1.

When that is combined with the observation that for these values of k, p=3 removes either all of the odd or all of the even values of n, we can perform just one Kronecker test per k per p.

Is this the equivalent of your observation, or have you spotted something more?

Regards,

Paul.

Joe O
03-28-2003, 09:16 AM
Originally posted by mklasson
Paul...

say, have you thought about something like this:

we know 2 | p-1. Therefore calculate 2^((p-1)/2) mod p, and see if it's == 1. If it is, then for each k calculate (p-k^-1)^((p-1)/2) mod p and see if it's == 1. If it isn't, then you can safely ignore that k for this p.

<snip>

Cheers,
Mikael

2^((p-1)/2) == 1 mod p <=> 2 is a Quadratic residue of p
(p-k^-1)^((p-1)/2) == 1 mod p <=> (p-k^-1) is a Quadratic residue of p

So you are saying if 2 is a QR of p then (p-k^-1) must be a QR of p or we can ignore that k for this p.

What if 2 is not a QR of p, i.e. 2^((p-1)/2) == -1 mod p?
Then what? We can't eliminate the k for this p, or can we require that (p-k^-1) also not be a QR of p in order for us to have to consider that k for this p?

Mystwalker
03-28-2003, 11:16 AM
:|ot|::
I don't know if you heard about this (http://www.nature.com/nsu/030317/030317-13.html) new discovery concerning prime numbers (the reference is maybe more in-depth). What do you think about it?


Oh, btw:
What is the biggest n factored so far? I have 19998058 - though not submitted yet. I could look it up myself, but ATM I'm too lazy to rewrite my programs to check for that. :D

Joe O
03-28-2003, 11:55 AM
As of March 22, 19998825 was the largest N factored. Yours would have been the third largest.

mklasson
03-28-2003, 02:05 PM
Paul,


For odd n, p can divide k.2^n+1 iff Kronecker(-2k/p)=1.
For even n, p can divide k.2^n+1 iff Kronecker(-k/p)=1.

The Kronecker symbol is just a Legendre when the denominator is prime, right?

Shouldn't it be
For odd n, p can divide k*2^n+1 iff Kronecker(-2*k^-1 / p)=1.
For even n, p can divide k*2^n+1 iff Kronecker(-k^-1 / p)=1.
? Or wait, I figure our expressions have to be equal due to the product of them being Kronecker(1 / p) = 1.

But also, is the "iff" really true? I thought you meant that:
For n even:
k*2^n+1 = 0 (mod p)
k*2^n = -1 (mod p)
-k*2^n = 1 (mod p)
so the inverse of -k has to be a QR.
And for n odd:
-k*2*2^(n-1) = 1 (mod p)
so the inverse of -2k must be a QR.
But I don't see why the other direction of your "iff" holds. Why does the QR-hood mean p _can_ divide some k*2^n+1?
Am I misunderstanding you?


Is this the equivalent of your observation, or have you spotted something more?
Don't know yet. We'll see once this is cleared up. Also, see my post to Joe O coming up soon.

Cheers,
Mikael

mklasson
03-28-2003, 02:25 PM
Originally posted by Joe O
2^((p-1)/2) == 1 mod p <=> 2 is a Quadratic residue of p
(p-k^-1)^((p-1)/2) == 1 mod p <=> (p-k^-1) is a Quadratic residue of p

So you are saying if 2 is a QR of p then (p-k^-1) must be a QR of p or we can ignore that k for this p.
Yes, but what I'm really thinking is that for -k^-1 to have a chance of being in the cyclic subgroup generated by 2, the order of -k^-1 in Z_p* must be a factor in the order of 2 in Z_p*. I.e. order(-k^-1,p) | order(2,p).


What if 2 is not a QR of p, ie 2^((p-1)/2) == -1 mod p?
Then what? We can't eliminate the k for this p, or can we require that (p-k^-1) also not be a QR of p in order for us to have to consider that k for this p?
In that case order(2,p) = p-1 and 2 is a generator for the entire Z_p* group. That means -k^-1 _will_ equal some power of 2.

Cheers,
Mikael

Joe O
03-30-2003, 01:20 AM
The Legendre symbol (n/m) is a number theoretic function which is defined to be equal to ± 1 depending on whether n is a quadratic residue modulo m where m is an odd prime.

The Jacobi symbol (n/m) extends the Legendre symbol (n/m) to m a positive odd integer.

The Kronecker symbol (n/m) extends the Jacobi symbol (n/m) to m an integer.

(n/m) == n^((m-1)/2) mod m, so 2^((p-1)/2) mod p could have been written (2/p) and (p-k^-1)^((p-1)/2) mod p could have been written ((-k^-1)/p).

So Mikael's statement could have been written:
Therefore calculate (2/p), and see if it's 1. If it is, then for each k calculate ((-k^-1)/p) and see if it's 1. If it isn't, then you can safely ignore that k for this p.

This statement is empirically true for our data, but it doesn't really add anything, and it doesn't tell us what to do when (2/p) is -1, as far as I can tell. Sorry, Mikael.

Paul's statements:
For odd n, p can divide k*2^n+1 iff Kronecker(-2k/p)=1.
For even n, p can divide k*2^n+1 iff Kronecker(-k/p)=1.
are also empirically true for our data and are conclusive. I haven't found the relevant theorems yet; perhaps Paul will supply them.

By the way, (2/m) = (-1)^((m^2-1)/8) =
1 for m == +/-1 (mod 8)
-1 for m == +/-3 (mod 8)
so it could easily be computed by looking at only 3 bits of m expressed as a binary integer. In our case m would be an odd prime, so we would only need to look at 2 bits since the third bit would always be 1. If the two bits are equal the result is 1; if they are unequal the result is -1.
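Joe O's bit observation can be checked with a small sketch (illustrative function name; this is just the standard supplement to quadratic reciprocity expressed via bits 1 and 2 of m):

```python
# (2/m) = (-1)^((m^2 - 1)/8) for odd m depends only on m mod 8:
# equal bits 1 and 2  -> +1 (m == +/-1 mod 8),
# unequal bits 1 and 2 -> -1 (m == +/-3 mod 8).
def legendre_2(m):
    b1 = (m >> 1) & 1
    b2 = (m >> 2) & 1
    return 1 if b1 == b2 else -1
```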

mklasson
03-30-2003, 08:34 AM
Joe,


Originally posted by Joe O
(n/m) == n^((m-1)/2) mod m, so 2^((p-1)/2) mod p could have been written (2/p) and (p-k^-1)^((p-1)/2) mod p could have been written ((-k^-1)/p).
Yes, but that leads the thought away from the core idea. The fact that a^((p-1)/2) = Legendre( a | p ) is just a footnote and not the main issue. This is also true:
Find a factor r of p-1. Calculate 2^((p-1)/r) mod p, and see if it's == 1. If it is, then for each k calculate (-k^-1)^((p-1)/r) mod p and see if it's == 1. If it isn't, then you can safely ignore that k for this p.
Optimally, what you want to know is the order of 2 in Z_p*. Then you can substitute that for (p-1)/r and have an even better chance of whacking a k. Obviously, you won't have to calculate 2^order(2,p) then...


So Mikael's statement could have been written:
Therefore calculate (2/p), and see if it's 1. If it is, then for each k calculate ((-k^-1)/p) and see if it's 1. If it isn't, then you can safely ignore that k for this p.
This statement is empirically true for our data, but it doesn't really add anything, and it doesn't tell us what to do when (2/p) is -1, as far as I can tell. Sorry, Mikael.
It's not only empirically true. Did you miss my latest post? The order of an element in a group divides the order of the group. And like I said, if L( 2 | p ) = -1 then 2 is a generator for Z_p*, which means that all values 1..p-1 can be written as powers of two.


By the way, (2/m) = (-1)^((m^2-1)/8) =
1 for m == +/-1 (mod 8)
-1 for m == +/-3 (mod 8)
so it could easily be computed by looking at only 3 bits of m expressed as a binary integer. In our case m would be an odd prime, so we would only need to look at 2 bits since the third bit would alway be 1. If the two bits are equal the result is 1, if they are unequal the result is -1.
Nifty.

Cheers,
Mikael

Joe O
03-30-2003, 10:24 AM
Mikael,

Find a factor r of p-1. Calculate 2^((p-1)/r) mod p, and see if it's == 1. If it is, then for each k calculate (-k^-1)^((p-1)/r) mod p and see if it's == 1. If it isn't, then you can safely ignore that k for this p.
While some people do not consider 1 a factor, r=1 will always work, i.e. 2^(p-1) == 1 (mod p). In fact, it may be the only integer that will work when L( 2 | p ) = -1, which is the case that we are considering, since you've already taken care of the case L(2|p) = 1 with the test for L((-k^-1)|p)=1.

Regards,
Joe.

mklasson
03-30-2003, 11:20 AM
Joe,


Originally posted by Joe O
While some people do not consider 1 a factor, r=1 will always work, i.e 2^(p-1) == 1 (mod p). In fact, it may be the only integer that will work when L( 2 | p ) = -1 which is the case that we are considering since you've already taken care of the case L(2|p) = 1 with the test for L((-k^-1)|p)=1.
I think you're putting way too much focus on the Legendre symbol. Forget about that for a while. 2^(p-1) = 1 is a totally uninteresting case since we know that _every_ a^(p-1) = 1. r=1 not only _may_ be, but absolutely _is_ the only integer that will work when L( 2 | p ) = -1. That's because L( 2 | p ) = -1 tells you there are p-1 distinct powers of two modulo p.

The test I proposed for (-k^-1)^((p-1)/2) is a weaker test than the full (-k^-1)^order(2,p).

Anyway, here's an example:
p = 73
a = 3 (just a random integer we want to test. Imagine it's some -k^-1)

order(2,p) = 9 tells us that there are 9 distinct powers of two modulo p. They are:
2^0 = 1
...
2^6 = 64
2^7 = 55
2^8 = 37
2^9 = 1
and the pattern just repeats after that.

a^order(2,p) = 3^9 = 46 != 1 (mod p)
Tells us that 3 is not in the cyclic subgroup generated by 2, i.e. no power of two is equivalent to 3 mod p.

EDIT: Oh yeah, L( 3 | p ) = 1, so the simple (-k^-1)^((p-1)/2) test does not eliminate 3 in this case.

Here, http://mathworld.wolfram.com/GroupOrder.html, is a good place to start for more info.

Cheers,
Mikael
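Mikael's p = 73 example can be reproduced with a naive order computation (a sketch with an illustrative name; a real siever would never loop like this, but it makes the example checkable):

```python
# Naive multiplicative order of a mod p - fine for tiny examples like p = 73.
def mult_order(a, p):
    t, e = a % p, 1
    while t != 1:
        t = t * a % p
        e += 1
    return e
```

This confirms order(2,73) = 9, that 3^9 == 46 != 1 (mod 73) so 3 is not a power of 2 mod 73, yet 3^36 == 1 (mod 73), so the simple (p-1)/2 test indeed fails to eliminate 3.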

Joe O
03-30-2003, 11:43 AM
Mikael, We're probably boring everybody else. I'll reply to you via PM.

paul.jobling
03-31-2003, 03:46 AM
Originally posted by Joe O
Mikael, We're probably boring everybody else. I'll reply to you via PM.

Hey, Joe, Mikael,

You aren't boring me - I am very interested, because if something useful comes out of this I will implement it! Unfortunately I have only had the time to skim the posts rather than think about them, but I will try to find some time to go into them in detail.

Regards,

Paul.

mklasson
03-31-2003, 08:16 PM
Paul,

your "iff" statements are false. That is, the other way isn't true.
Here's a counter example:
p=31
k=26
L( -2*26 | 31 ) = 1
L( -26 | 31 ) = 1
But p does not divide any 26*2^n+1.
-26^-1 = 25
<2> = {1,2,4,8,16}

And now for my oops... L( a | p ) = -1 doesn't imply order(a,p) = p-1, it just tells you order(a,p) does not divide (p-1)/2.

Anyhow, now that we know the Kronecker tests don't catch all the impossible k, I'm still proposing a consideration of something like this:
Trial-factor p-1 a little bit to get one or a few small factors. Then establish as small an upper bound u as possible for order(2,p) using the found factors. Next proceed to calculate k^u (mod p) for the different k. Should any of the k^u != 1 we discard that k.

As some sort of "proof of concept", this method applied to p = 31 and provided with the factors 2 and 3 of p-1=30 successfully eliminates all 26 impossible k values, while the "Kronecker tests" miss 10 of them. Granted, this is an extreme case... But still, it seems worth investigating further.

Regards,
Mikael

paul.jobling
04-01-2003, 04:56 AM
Originally posted by mklasson
Paul,

your "iff" statements are false. That is, the other way isn't true.


I see what you mean. I'll check the original - ah, I added them here - you are right, it is just an if.



Trial-factor p-1 a little bit to get one or a few small factors. Then establish as small an upper bound u as possible for order(2,p) using the found factors. Next proceed to calculate k^u (mod p) for the different k. Should any of the k^u != 1 we discard that k.


I'll read Cohen tonight for a fast order(2,p) algorithm. This might be the hard part, as to work properly we need to get the order right. I can find the inverse of k mod p quickly; I can find a^order(2,p) quickly; so it all depends on how quickly I can find order(2,p). If I have to _completely_ factorise p-1, then all bets are probably off at the level that we are at.

I'll let you know. Thanks for a great idea!

Regards,

Paul.

cmprince
04-01-2003, 10:10 AM
Joe, Mikael, Paul,

I haven't thought about things like quadratic residues and modular math in years, and using my math and physics degree to write VBA spreadsheets at work doesn't exactly help to flex my cranial muscles. I can't think of a more interesting discussion to be following along with; don't mistake silence for disinterest!

Thanks for the illuminating discussion,
Chris

paul.jobling
04-02-2003, 10:26 AM
Hi all,

Well, I implemented Mikael's idea, but unfortunately the time spent getting the order of 2 mod p was too long, and overall no time was saved - in fact, the program ran at half the previous speed.

This is what I did:

First, I set order to p-1. This is my guess at the order of 2 mod p.

Then for each of the first 50 small primes, if the prime - call it r - divided p-1 then I checked to see if 2^((p-1)/r) was 1. If it was, then I divided order by r.

If a k passed the Kronecker test, I saw if k^order mod p was 1. If it was not then I did not use that k. Obviously I only tried this if order was not p-1.

I didn't manage to look at Cohen last night to see if I was going about getting the order of 2 mod p in the best way, I will have a look tonight and see.

Many thanks for the good idea though, Mikael, and keep them coming!

Regards,

Paul.
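The procedure Paul describes can be sketched like this (illustrative Python with hypothetical names; the while loop is a small refinement that repeats the division to handle repeated prime factors, which Mikael later flags as important):

```python
SMALL_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]   # first few small primes

# Start with order = p-1 as the guess for the order of 2 mod p. For each
# small prime r dividing the current bound, keep dividing it out while
# 2^(order/r) == 1 (mod p). The result is a multiple of the true order.
def order2_upper_bound(p):
    order = p - 1
    for r in SMALL_PRIMES:
        while order % r == 0 and pow(2, order // r, p) == 1:
            order //= r
    return order
```

By construction 2^order == 1 (mod p) is preserved at every step, so the returned value is a valid exponent to use in the k^order test.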

mklasson
04-02-2003, 12:35 PM
Paul,

a couple of thoughts:

trying the first 50 primes is probably too much. I've tested several different counts, and almost all cases are caught by using the first 10 or so primes.

Using the method on the first 1000 primes > 11000*10^9, below are the results. Trial-dividing p-1 with the first u primes eliminates v k-values (in addition to those eliminated by your Legendre tests, that is). For w of the primes, all k-values are eliminated. Table:
     u      v    w
     5   1382   92
    10   1436   98
    20   1466  103
   100   1471  104
100000   1471  104
So on average about 1.5 k values are eliminated for each p, and about 10% of the p values are discarded altogether (no possible k left).

The trial division itself probably isn't the most costly part though. Here's an idea for speeding up the power calculating part:
Instead of calculating 2^((p-1)/factor) you can store the found factors, calculate their product d, let lb=(p-1)/d, calculate a=2^lb, and then calculate t=a^(d/factor), setting d=d/factor whenever t=1. At the end, the upper bound for order(2,p) is lb*d.

Btw, you're allowing multiple identical prime factors, right? Seems very important.
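The shortcut above can be sketched as follows (my rendering of the idea, not Mikael's code; `factors` is assumed to be the small prime factors of p-1 found by trial division, listed with multiplicity):

```python
def order_bound_fast(p, factors):
    """Upper bound on order(2,p). Instead of computing 2^((p-1)/r) mod p
    separately for each factor r, precompute a = 2^lb mod p with
    lb = (p-1)/d (d = product of the found factors), then raise a to
    the much smaller exponents d/r. Whenever a^(d/r) == 1, the bound
    lb*d can be shrunk by setting d = d/r."""
    d = 1
    for r in factors:
        d *= r
    lb = (p - 1) // d
    a = pow(2, lb, p)
    for r in factors:
        if pow(a, d // r, p) == 1:
            d //= r
    return lb * d
```

Since a^(d/r) = 2^(lb*d/r) = 2^(bound/r) mod p, each test is equivalent to the original one, but the exponentiations are done with exponents of size d rather than p.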

Another idea I think you should look into, and it will definitely help this idea along as well, is to dynamically adapt your alpha value to how many k are left. Under the simplifying assumption that one step of arange costs as much as one step of brange, the optimal work when r k-values are left is something like this:
 r   work (% of the work for r=12)
 0     0
 1    29
 2    41
 3    50
 4    58
 5    65
 6    71
 7    76
 8    82
 9    87
10    91
11    96
12   100
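The figures in the table are consistent with total work growing like sqrt(r), as you would expect from an optimally balanced baby-step/giant-step trade-off -- that reading is my inference from the numbers, not something stated in the post:

```python
import math

# With the optimal balance, work(r)/work(12) = sqrt(r/12);
# rounding to whole percent reproduces the table above.
work_pct = {r: round(100 * math.sqrt(r / 12)) for r in range(13)}
# e.g. work_pct[1] == 29, work_pct[6] == 71, work_pct[12] == 100
```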


Many thanks for the good idea though, Mikael, and keep them coming!

I'll try. :)

I don't know what your views are on sharing the source code, but it would certainly help me out if I could try things directly on the regular sobsieve program.

Regards,
Mikael

mklasson
04-03-2003, 10:13 AM
Good news, everyone!

Closer inspection of the remaining n values for each k yields the following.

for k=4847, all remaining n values are congruent modulo 24... Table:
k t (all remaining n equiv mod t)
4847 24
5359 8
10223 12
19249 36
21181 24
22699 72
24737 24
27653 36
28433 24
33661 24
55459 12
67607 8
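The t values above can be recovered mechanically from a list of the remaining n for each k (as stored in sob.dat): t is just the gcd of the differences between the n values. A small sketch -- the sample n values here are made up for illustration, not taken from sob.dat:

```python
from math import gcd
from functools import reduce

def common_modulus(ns):
    """Largest t such that every n in ns is congruent mod t:
    the gcd of the differences from the first element."""
    return reduce(gcd, (n - ns[0] for n in ns[1:]), 0)

# hypothetical remaining n values, all congruent to 7 mod 24
print(common_modulus([7, 31, 103, 151]))
```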

Sooo, extending the Kronecker idea to handle t-th power residues instead of just QRs we get (congruences are mod p throughout):
n = a (mod t)
-k*2^n = 1 (mod p)
-k*2^a * 2^(n-a) = 1 (mod p)
2^(n-a) = (-k*2^a)^-1 (mod p)
(2^((n-a)/t))^t = (-k*2^a)^-1 (mod p)    [t divides n-a, so (n-a)/t is an integer]
i.e. (-k*2^a)^-1 has to be a t-th power residue mod p for p to be able to divide some remaining k*2^n+1.

We check this condition by raising (-k*2^a)^-1 to the power (p-1)/gcd(t,p-1). If the result mod p is different from 1, we discard this k.

The old Kronecker tests are a subset of this and can be fully replaced.
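A sketch of this test in Python (my rendering, not SoBSieve code; `pow(x, -1, p)` for the modular inverse needs Python 3.8+, and p is assumed prime and coprime to k*2^a):

```python
from math import gcd

def k_possible(k, a, t, p):
    """t-th power residue test: p can divide some remaining k*2^n+1
    (with n = a mod t) only if (-k*2^a)^-1 is a t-th power residue
    mod p, i.e. only if it is 1 when raised to the (p-1)/gcd(t,p-1)."""
    b = pow(-k * pow(2, a, p), -1, p)   # (-k*2^a)^-1 mod p
    e = (p - 1) // gcd(t, p - 1)
    return pow(b, e, p) == 1
```

As a sanity check: 41 = 5*2^3 + 1, so with k=5, t=4, a=3 the test must pass for p=41; with a=2 (the wrong residue class for n) it happens to fail, so p=41 would be skipped for that k.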

Mandatory happy figures coming up next:
for the first 1000 primes > 11000*10^9, 9015 k values (2956 more than the Kronecker tests) are discarded, and 116 p values are left with no remaining k.
If this idea is combined with my previous one about order(2,p), 365 extra k values are discarded and 83 extra p values are left empty.

For p values as high as those we're at now, one can of course precalc -k*2^a for every k except two: 27653 and 22699 -- their a values are 33 and 46. One could still precalc something like -27653*2^10, 2^23, and -22699*2^23.

A newer sob.dat (mine's dated Jan 30) may allow t values to be increased further. Where can I pick up one of those?

I'm just fervently hoping I haven't taken a wrong turn somewhere now...

Regards,
Mikael

paul.jobling
04-03-2003, 11:05 AM
Mikael,

If your math is right you could be on to something.... watch this space.

Regards,

Paul.

paul.jobling
04-04-2003, 05:47 AM
Well, I sat down with a pad last night and convinced myself that Mikael was right, and that I would be able to come up with a fast implementation of this. I think that I have :D

But I need some beta testing to be performed - specifically, I need people to test that when it is given the same data as the last version of SoBSieve, it removes the same {k,n} values.

If you want to do this, send an email to me (paul.jobling at whitecross dot com) and I will send the beta version of the software to you by return.

I see an increase in rate from 75 Kp/sec to 87 Kp/sec. Which is nice...


Regards,

Paul.

paul.jobling
04-04-2003, 11:23 AM
Hi all,

Well, I have got something that is about 45% faster on this PC - which is nice ;)

I implemented Mikael's idea of only looking for t-th powers, which is an extension of the Kronecker condition that I already had. I got extra speed by using my original Kronecker code when necessary, and also by doing as little work as possible.

You may find that you need to slightly modify alpha.

I attach a zip file containing both programs - the Windows and the console versions. The Windows version is actually *faster* than the console version now, which is odd.

EDIT: there is a bug somewhere! I have removed the link to the download and I will post it back when I find the bug.

Regards,

Paul.