P-1 factorer

jjjjL
06-09-2003, 01:52 PM
i'm gonna skip explaining what this is and get right to what i'm up to.

basically, i hacked up the GIMPS P-1 factoring code and made it do proth numbers. well, maybe.

the thing is, I haven't been able to do much serious testing of it because i have no way to know what it should find for which values.

it has found factors on some numbers but all of them are rather trivial factors (i.e. 5, 7, 11).

if you know how to test this, download it and report back when you get some results. i'd be very impressed if you found a factor for a currently unfactored number around n=4M.

program syntax is:

ecm.exe <k> <n-start> <n-quit> <B1> <B2>

so if you wanted to check 4847*2^4000000+1, you'd do

ecm.exe 4847 4000000 4000001 1000000 40000000

if you make n-quit larger by more than 1, it will step through multiple "n" as a quick way to batch things for testing.

you'll probably have to know a bit about P-1 factoring to test this correctly. if you feel you know a lot and have visual studio installed, email me and i'll send you the code for you to look through. it's a derivative of what's contained in ecm.c in the GIMPS source located at http://www.mersenne.org/gimps/source23.zip
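for reference, the core of the method (stage 1 only) boils down to something like this python sketch - this is just an illustration of the idea, not the SBFactor/GIMPS code, and pure python is way too slow for numbers this size:

import math

def primes_up_to(limit):
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(sieve[p * p :: p]))
    return [p for p in range(2, limit + 1) if sieve[p]]

def p1_stage1(k, n, B1):
    # stage 1 of P-1 on N = k*2^n+1: raise a base to every prime
    # power <= B1; if some prime q | N has q-1 that smooth, the
    # gcd at the end reveals q
    N = k * (1 << n) + 1
    a = 3
    for p in primes_up_to(B1):
        pe = p
        while pe * p <= B1:  # use the full prime power <= B1
            pe *= p
        a = pow(a, pe, N)
    g = math.gcd(a - 1, N)
    return g if 1 < g < N else None

stage 2 (which catches one extra prime between B1 and B2) is what the B2 parameter controls; it's omitted from the sketch.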

so let me know if you think this works.

http://www-personal.engin.umich.edu/~lhelm/p-1%20factorer.zip

biwema
06-10-2003, 05:40 AM
Hi Louie

it's a good idea to adapt the gimps p-1 code for sierpinski factoring.
Well done!

Nevertheless, I think something is not working right, but I don't know what.

ecm.exe 24737 4559143 4559144 8000 250000

should find

factor: 1000225084123
(1000225084122 = 2*3*3*3*11*7523*223831)

but it does not find it.
i just tried to rediscover that factor for testing purposes.
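As a quick way to double-check a test point like this with stock Python (just a sketch; trial division is fine at these sizes):

def divides(q, k, n):
    # q | k*2^n+1  <=>  k*(2^n mod q) + 1 == 0 (mod q)
    return (k * pow(2, n, q) + 1) % q == 0

def p1_smooth(q, B1, B2):
    # strip everything <= B1 from q-1; for P-1 to work, whatever is
    # left over must be 1 or a single prime <= B2 (it is prime here)
    m = q - 1
    for d in range(2, B1 + 1):
        while m % d == 0:
            m //= d
    return m == 1 or m <= B2

q = 1000225084123
print(divides(q, 24737, 4559143))      # should print True
print(p1_smooth(q, 8000, 250000))      # True: 223831 is the lone stage-2 prime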

biwema

jjjjL
06-10-2003, 06:25 AM
thanks biwema, that's a good test point.

i'm not too surprised it's not working. i haven't been able to make it find anything larger than 17 so far.

i've verified that it is calculating the proth number right. there must be something else flawed about how it's doing the calculations.

do you know by chance what values should be getting passed to the gcd calls for your test number? if i knew some intermediate numbers, it might be easier to figure out where things are going wrong.

is there another P-1 factoring package available to compare it to (or use instead)?

-Louie

smh
06-10-2003, 07:07 AM
The GMP-ECM client can also do P-1 (and P+1). On my PC it crashes when trying it on large numbers but it might help you figure out how the algorithm works.

Source can be found here (http://www.loria.fr/~zimmerma/records/ecm-5.0.html)

jjjjL
06-10-2003, 08:22 AM
do you know of a large number expression converter so i can create files with the numbers in them?

it looks like ECMnet has no expression input. in any case i know a utility like this exists, i just can't find it. can you point me at one so i don't have to write it?

-Louie

actually, i'll probably just build a new version of ecmnet that accepts k/n instead of regular numbers.

wblipp
06-10-2003, 09:36 AM
Originally posted by smh
The GMP-ECM client can also do P-1 (and P+1). On my PC it crashes when trying it on large numbers but it might help you figure out how the algorithm works.

Source can be found here (http://www.loria.fr/~zimmerma/records/ecm-5.0.html)

The crashing on large numbers happens because the standard build uses the local memory stack, and this overflows when storing large numbers. According to Paul Zimmermann, the author of GMP-ECM, to build using other memory techniques "use configure --disable-alloca". I don't have a C compiler; I'd be thankful if someone with a C compiler would make available a version of GMP-ECM 5.0 or 5.1-beta that works on Win2K with SoB sized numbers.

Other people asked about this problem and got the diagnosis from Paul Leyland here (http://groups.yahoo.com/group/primenumbers/message/12626) .
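(For what it's worth, the rebuild Paul Zimmermann describes is of GMP itself - roughly, assuming a unix-style shell and working compiler:

./configure --disable-alloca
make
make install

after which GMP-ECM is rebuilt against the new library.)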

jjjjL
06-10-2003, 09:58 AM
Originally posted by wblipp
The crashing on large numbers happens because the standard build uses the local memory stack, and this overflows when storing large numbers. According to Paul Zimmermann, the author of GMP-ECM, to build using other memory techniques "use configure --disable-alloca". I don't have a C compiler; I'd be thankful if someone with a C compiler would make available a version of GMP-ECM 5.0 or 5.1-beta that works on Win2K with SoB sized numbers.

Other people asked about this problem and got the diagnosis from Paul Leyland here (http://groups.yahoo.com/group/primenumbers/message/12626) .

i'm currently playing with GMP 4.1.2 and ECM 5.0 in cygwin right now.

i just got to the crashing part. it wouldn't handle exponents much larger than 100000. having used GMP for the original SB, i do remember that configure line. it will take me a while to rebuild gmp but i'll give that a try. thanks.

-Louie

jjjjL
06-10-2003, 10:55 AM
ok, here's a special version of GMP-ECM that was built with the --disable-alloca.

it also accepts slightly modified files... rather than each line being a number, it pulls numbers in pairs. the first line holds the k, the second the n value. i haven't tried it with multiple values but there's no reason to think it wouldn't work.
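for example, an input file for biwema's test point above would just be the two lines:

24737
4559143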

i'm gonna add progress reporting to it now. it may not crash, but it doesn't seem very fast either.

-Louie

http://www-personal.engin.umich.edu/~lhelm/ecm-prothnumber-cygwin-athlon.zip

rosebud
06-10-2003, 11:27 AM
ok, tried it with k:24737 n:4559143

this output seems to be right:
GMP-ECM 5.0 [powered by GMP 4.1.2] [ECM]
Input number has 1372444 digits
Using B1=8000, B2=230000, polynomial x^1, sigma=3668824256

but then very soon it's crashing...

Exception: STATUS_STACK_OVERFLOW at eip=00427603
eax=0002F8A0 ebx=007AF658 ecx=005B14A0 edx=007AF64C esi=007AF610 edi=00022C8A
ebp=007AF4C8 esp=007AF498 program=C:\EIGENE DATEIEN\ECM-PROTHNUMBER-CYGWIN-ATHLON\ECM.EXE
cs=016F ds=0177 es=0177 fs=14F7 gs=0000 ss=0177
Stack trace:
Frame Function Args
007AF4C8 00427603 (00FDC058, 00C99310, 00A6CA58, 00DAF790)
007AF4F8 0040E961 (007AF550, 007AF64C, 007AF658, 007AF610)
007AF578 00401509 (007AFBC0, 007AF5F8, 007AF5E0, 007AFBF0)
007AF688 0040321F (007AFBC0, 007AFC00, 007AFBF0, 007AFBD0)
007AFC28 0040809E (00000004, 009E0478, 009E0328, 81976C98)
007AFDB8 610073E8 (610C39E8, FFFFFFFE, 00000014, 610C390C)
007AFE08 610076CD (00000000, 00000000, BFFBB490, 007AFF68)
007AFE28 00427672 (00407A60, 037F0500, 007AFF78, BFF7B9E4)
007AFE38 0040103C (00000000, 81976C98, 005A0000, 006D6345)
007AFF78 BFF7B9E4 (819672B0, 00000008, 81976C98, 00000000)
46465 [main] ECM 514743 handle_exceptions: Exception: STATUS_ACCESS_VIOLATION
47048 [main] ECM 514743 handle_exceptions: Error while dumping state (probably corrupted stack)

wblipp
06-10-2003, 11:43 AM
Originally posted by jjjjL
ok, here's a special version of GMP-ECM that was built with the --disable-alloca.

it also accepts slightly modified files... rather than each line being a number, it pulls numbers in pairs. the first line holds the k, the second the n value.

I think GMP-ECM has expression expansion already. At least the 5.1-beta that I got from the Yahoo PrimeNumbers Group (http://groups.yahoo.com/group/primenumbers/) would accept expressions like 4847*2^300087+1.

jjjjL
06-10-2003, 12:28 PM
well, it did successfully factor 22699*2^2946070+1 after about 1.5 hours. for all i know, it spent the majority of that time printing the cofactor (stupid program printing a 1 million digit number).

B1 = 2500 B2 = 2500

produces the factor

1000809717779 = 2 x 13 x 71 x 131 x 1933 x 2141 + 1

when the P-1 factorer for GIMPS works, does anyone know how long it would take to perform a similar-sized test?

-Louie

ps - i get a seg fault when trying that other larger number too. :(

Moo_the_cow
06-10-2003, 04:57 PM
I don't think that there should even be a P-1 factorer existing right now, because it seems much more efficient to sieve >500,000 candidates simultaneously than to attack one number at a time.

jjjjL
06-10-2003, 05:17 PM
Originally posted by Moo_the_cow
I don't think that there should even be a P-1 factorer existing right now, because it seems much more efficient to sieve >500,000 candidates simultaneously than to attack one number at a time.

i thought that for a while too.

but once the tests get assigned, it's too late for a factor. and contrary to some people's suggestions, factors NEVER come in while tests are pending. it hasn't happened yet. either the factor gets in before the test gets assigned, or it comes in after it's been finished. so that means that it is going to take someone 3-5 days to test the number if the server gets to it before it's factored.

so it makes sense to spend some portion of those 3-5 days trying to factor the number with P-1 or P+1 factoring. P-1 factoring is going to find factors that we won't find with regular sieving for years. so if it only takes a couple hours to factor down a number before it gets assigned, that's worth it. as long as the probability of finding a factor times the time it takes to find them is less than the time it would take to run a primality test, it's better. in fact, since a factor is even better than a proth test (no double check ever required), it may be better to factor this way when possible.
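to put that break-even rule in code form (a sketch - the 3% and 5-day numbers below are made-up placeholders, not measurements):

def p1_worth_it(p_factor, factor_hours, prp_hours, tests_saved=2.0):
    # expected prp time saved must exceed the time spent factoring
    return p_factor * tests_saved * prp_hours > factor_hours

print(p1_worth_it(0.03, 7.0, 120.0))  # True: 0.03 * 2 * 120 = 7.2 > 7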

the real problem is making an efficient program to do the factoring. GMP-ECM seems a little weak. looking at the code, there are a bunch of places where things could be optimized. their implementation is so general (and in my opinion, naive) that it could be improved by even a novice programmer. BTW, no disrespect to the authors of GMP-ECM... let it be known, Paul Zimmermann is godlike. But this is the first version to have P-1/P+1 factoring so it's not as well tuned as you might expect.

George Woltman has suggested that very little needs to be done to his publicly available code to make it work for our purposes. i'm trying to port it and so is another group. no luck there yet. maybe someone here can do it.

either package would be a nice place to start. i wouldn't be surprised if someone here wrote their own that was better than both (or at least GMP-ECM).

we'll have to see who steps up to help. it certainly would be fun to have a "P-1 factor coordination" thread. ;)

-Louie

jjjjL
06-10-2003, 10:20 PM
well, this version is at least capable of doing larger numbers.

55459
4000006

produced:

$ ./ecm.exe -pm1 105 1150 < input.txt
GMP-ECM 5.0 [powered by GMP 4.1.2] [P-1]
Input number has 1204127 digits
Using B1=105, B2=1150, polynomial Dickson(4), x0=4059057736
Step 1 took 1202580ms
Step 2 took 2614118ms
********** Factor found in step 2: 18437164305601
Found probable prime factor of 14 digits: 18437164305601

that looks like an 18T factor in 1 hour. it's a lot easier when you know where they are. ;)

this does give me hope that it may be useful for cracking a real unknown factor too, after some optimizations anyway. anyone got a speed comparison to gimps yet? what does it take to do the same exponent to those depths? probably not an hour.

-Louie

wblipp
06-10-2003, 10:25 PM
Originally posted by jjjjL
it certainly would be fun to have a "P-1 factor coordination" thread. ;)

It's an interesting design question whether P-1 (and possibly P+1) factoring should be done separately, like sieving, or should be integrated into the SoB client to be performed automatically as part of every test.

One advantage of integration is that the testing will always be delayed until the last possible moment, so there is every opportunity for a sieving factor to arrive first.

One advantage of separate tests is that stage 2 of P-1 and P+1 works best with large amounts of memory, so we could dedicate big-memory machines to this testing.

Another interesting design issue will be determining our optimization points. Our optimization points should be different from GIMPS because they always need a double check and we don't really know how much double checking we will be doing.

jjjjL
06-10-2003, 10:34 PM
Originally posted by wblipp
It's an interesting design question whether P-1 (and possibly P+1) factoring should be done separately, like sieving, or should be integrated into the SoB client to be performed automatically as part of every test.

One advantage of integration is that the testing will always be delayed until the last possible moment, so there is every opportunity for a sieving factor to arrive first.

One advantage of separate tests is that stage 2 of P-1 and P+1 works best with large amounts of memory, so we could dedicate big-memory machines to this testing.

Another interesting design issue will be determining our optimization points. Our optimization points should be different from GIMPS because they always need a double check and we don't really know how much double checking we will be doing.

i think it will be best to start it out kinda like sieving and then make it a server project. probably separate but i'm not sure. we'll see.

-Louie

smh
06-11-2003, 06:32 AM
this does give me hope that it may be useful for cracking a real unknown factor too. after some optimizations anyway. anyone got a speed comparison yet to gimps yet?

Don't forget that GIMPS has another advantage: factors of Mersenne numbers are of the form 2kp+1, so only k has to be smooth; p is a known prime that always divides the factor minus 1.

jjjjL
06-11-2003, 08:04 AM
Originally posted by smh
Don't forget that GIMPS has another advantage: factors of Mersenne numbers are of the form 2kp+1, so only k has to be smooth; p is a known prime that always divides the factor minus 1.

yeah, but they have that advantage in their regular sieving too so it's pretty much a wash when comparing sieve to P-1 advantage.

however, we shouldn't overlook the fact that while gimps factors are all smoother for P-1 factoring, our factors are just as likely to be smooth for P+1 factoring. for any given number, one is as likely as the other to be smooth. that's something gimps doesn't have to think about.

-Louie

smh
06-11-2003, 10:50 AM
yeah, but they have that advantage in their regular sieving too

GIMPS doesn't sieve; they trial factor each exponent separately.

But you are right, their factors have special forms so they won't have to try each prime.

jjjjL
06-13-2003, 03:10 PM
new version ready. much better. let me know how it works for you if you test it.

http://www-personal.engin.umich.edu/~lhelm/sbfactor02.zip

-Louie

wblipp
06-13-2003, 04:45 PM
Originally posted by jjjjL
new version ready. much better. let me know how it works for you if you test it.

http://www-personal.engin.umich.edu/~lhelm/sbfactor02.zip

-Louie

First thing I noticed is that specifying B1 and B2 using "e" notation doesn't work, although it works fine in GMP-ECM. For example, B1 given as 5e4 is read as B1=5.

smh
06-13-2003, 05:00 PM
I tried to factor the number you tried above, but it doesn't seem to find a factor.



D:\factor\p_1\sobp_1>sbfactor 55459 4000006 105 1150
SBFactor v0.2
P-1 factoring for number of the form k*2^n+1.
Adapted from GIMPS v23.4 by George Woltman and Louis Helm
P-1 on 55459*2^4000006+1 with B1=105, B2=1150
initializing test
sieve finished
exp is: 718766754945489455304472257065075294400
55459*2^4000006+1 stage 1 is 71.613 complete.
55459*2^4000006+1 stage 1 complete. 312 transforms. Time: 11 seconds
Starting stage 1 GCD - please be patient.
v: 1
retval: 1
v is one: 1
N is v: 1
55459*2^4000006+1 stage 2 is 8.995 complete.
55459*2^4000006+1 stage 2 is 73.876 complete.
55459*2^4000006+1 stage 2 complete. 480 transforms. Time: 95 seconds
Starting stage 2 GCD - please be patient.
v: 1
retval: 1
v is one: 1
N is v: 1
finished.
Total factoring Time: 2 minutes


I'll do some more testing.

edit

And it misses this one too:



D:\factor\p_1\sobp_1>sbfactor 22699 2946070 2000 2500
SBFactor v0.2
P-1 factoring for number of the form k*2^n+1.
Adapted from GIMPS v23.4 by George Woltman and Louis Helm
P-1 on 22699*2^2946070+1 with B1=2000, B2=2500
initializing test
sieve finished
exp is: 151117794877444315307536308337572822173736308853579339903227904473000476
32234723465512216086666894694199395101427093351203019495722137195682884352156808
21737862512423331578304504356232116643085003168444786178091011582206721088950535
08829266120497031742749376045929890296052805527212315382805219353316270742572401
96203546487823570375946479680607513105652007983695577041502131850827298210373665
86333904113477590005632712260621823459641841673469182252438563487940133554184046
95826256911622054015423611375261945905974225257659010379414787547681984112941581
32519839663468565921786120877140032250738816196751371916636683989421404078773347
12878456298339938854134622252945487855816418046204172565636852805865113019183990
10451347815776570842790738545306707750937624267501103840324470083425714138183905
657667736579430274197734179172691637931540695631396056193786415805463680000
22699*2^2946070+1 stage 1 is 3.476 complete.
22699*2^2946070+1 stage 1 is 6.952 complete.
22699*2^2946070+1 stage 1 is 10.428 complete.
22699*2^2946070+1 stage 1 is 13.903 complete.
22699*2^2946070+1 stage 1 is 17.379 complete.
22699*2^2946070+1 stage 1 is 20.855 complete.
22699*2^2946070+1 stage 1 is 24.331 complete.
22699*2^2946070+1 stage 1 is 27.807 complete.
22699*2^2946070+1 stage 1 is 31.283 complete.
22699*2^2946070+1 stage 1 is 34.758 complete.
22699*2^2946070+1 stage 1 is 38.234 complete.
22699*2^2946070+1 stage 1 is 41.710 complete.
22699*2^2946070+1 stage 1 is 45.186 complete.
22699*2^2946070+1 stage 1 is 48.662 complete.
22699*2^2946070+1 stage 1 is 52.138 complete.
22699*2^2946070+1 stage 1 is 55.613 complete.
22699*2^2946070+1 stage 1 is 59.089 complete.
22699*2^2946070+1 stage 1 is 62.565 complete.
22699*2^2946070+1 stage 1 is 66.041 complete.
22699*2^2946070+1 stage 1 is 69.517 complete.
22699*2^2946070+1 stage 1 is 72.993 complete.
22699*2^2946070+1 stage 1 is 76.469 complete.
22699*2^2946070+1 stage 1 is 79.944 complete.
22699*2^2946070+1 stage 1 is 83.420 complete.
22699*2^2946070+1 stage 1 is 86.896 complete.
22699*2^2946070+1 stage 1 is 90.372 complete.
22699*2^2946070+1 stage 1 is 93.848 complete.
22699*2^2946070+1 stage 1 is 97.324 complete.
22699*2^2946070+1 stage 1 complete. 5754 transforms. Time: 144 seconds
Starting stage 1 GCD - please be patient.
v: 1
retval: 1
v is one: 1
N is v: 1
22699*2^2946070+1 stage 2 is 0.600 complete.
22699*2^2946070+1 stage 2 complete. 354 transforms. Time: 175 seconds
Starting stage 2 GCD - please be patient.
v: 1
retval: 1
v is one: 1
N is v: 1
finished.
Total factoring Time: 3 minutes


It does find a factor in stage one though:



D:\factor\p_1\sobp_1>sbfactor 22699 2946070 2500 2500
SBFactor v0.2
P-1 factoring for number of the form k*2^n+1.
Adapted from GIMPS v23.4 by George Woltman and Louis Helm
P-1 on 22699*2^2946070+1 with B1=2500, B2=2500
initializing test
sieve finished
exp is: 106510494520812546426453164388604465285999854308692495682249271089807213
06428206240809505196704302195900220417077765787838448094951793623103176400361198
53473905102295359692079190995380381915412795803937490383602897354051010076575820
40036957623725894088347226025318922894165891041644695093217476255406976946625514
76865848264516186199644592963431936652632231100205244377467834576480916368826217
22033661955258524263306986748419129559530383605063848393796452258387378142362943
35094270131993616702817768239494210813296081888048626028788194625898235424628090
05231685890147798598832045271706506989408597963506302051910937812755233786298809
36380967043746733383246311147580217120504801391362222374380949660215311106601648
94745482137497219769183873441603140255708836274160814471913611742603571743637111
25657042853121002635877156167341476244071806861681527119271280458611022559610530
70145984309206300197750913103095033255960222812276013878589321622761614625953862
90938157830391434752960869263535148732460023317694364498806454958781925302998074
980866120862904826561098689492823559877964496277760000
22699*2^2946070+1 stage 1 is 2.775 complete.
22699*2^2946070+1 stage 1 is 5.549 complete.
22699*2^2946070+1 stage 1 is 8.324 complete.
22699*2^2946070+1 stage 1 is 11.099 complete.
22699*2^2946070+1 stage 1 is 13.873 complete.
22699*2^2946070+1 stage 1 is 16.648 complete.
22699*2^2946070+1 stage 1 is 19.423 complete.
22699*2^2946070+1 stage 1 is 22.198 complete.
22699*2^2946070+1 stage 1 is 24.972 complete.
22699*2^2946070+1 stage 1 is 27.747 complete.
22699*2^2946070+1 stage 1 is 30.522 complete.
22699*2^2946070+1 stage 1 is 33.296 complete.
22699*2^2946070+1 stage 1 is 36.071 complete.
22699*2^2946070+1 stage 1 is 38.846 complete.
22699*2^2946070+1 stage 1 is 41.620 complete.
22699*2^2946070+1 stage 1 is 44.395 complete.
22699*2^2946070+1 stage 1 is 47.170 complete.
22699*2^2946070+1 stage 1 is 49.945 complete.
22699*2^2946070+1 stage 1 is 52.719 complete.
22699*2^2946070+1 stage 1 is 55.494 complete.
22699*2^2946070+1 stage 1 is 58.269 complete.
22699*2^2946070+1 stage 1 is 61.043 complete.
22699*2^2946070+1 stage 1 is 63.818 complete.
22699*2^2946070+1 stage 1 is 66.593 complete.
22699*2^2946070+1 stage 1 is 69.367 complete.
22699*2^2946070+1 stage 1 is 72.142 complete.
22699*2^2946070+1 stage 1 is 74.917 complete.
22699*2^2946070+1 stage 1 is 77.691 complete.
22699*2^2946070+1 stage 1 is 80.466 complete.
22699*2^2946070+1 stage 1 is 83.241 complete.
22699*2^2946070+1 stage 1 is 86.016 complete.
22699*2^2946070+1 stage 1 is 88.790 complete.
22699*2^2946070+1 stage 1 is 91.565 complete.
22699*2^2946070+1 stage 1 is 94.340 complete.
22699*2^2946070+1 stage 1 is 97.114 complete.
22699*2^2946070+1 stage 1 is 99.889 complete.
22699*2^2946070+1 stage 1 complete. 7208 transforms. Time: 188 seconds
Starting stage 1 GCD - please be patient.
v: 1000809717779
retval: 1
v is one: 0
N is v: 1
1000809717779 | 22699*2^2946070+1
Total factoring Time: 3 minutes

jjjjL
06-13-2003, 05:45 PM
I'm seeing inconsistent results too. Better than before when it didn't work at all, but there's still something broken.

For some reason, that first example you tried works if I use B1=105 B2=1300; however, it does not work with some other values.

George Woltman has helped me fix one bug in this version already and is helping me locate the others.

Right now, I'm working on integrating a modified version of the B1/B2 auto-bound setter.

-Louie

jjjjL
06-13-2003, 11:39 PM
nothing incredible but I did find a new factor using P-1

22283121920233 | 4847*2^302511+1

22283121920233 = 2 ^ 3 x 3 x 2161 x 3877 x 110819 + 1

I found it with B1=25000 B2=2500000
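(sanity check on those bounds: every prime in the factorization above is below B1=25000 except 110819, which is below B2=2500000 - exactly the one-big-prime-left-over case that stage 2 exists for.)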

This is the batch file I ran for a couple hours:

sbfactor.exe 4847 300207 25000 0
sbfactor.exe 4847 300831 25000 0
sbfactor.exe 4847 301143 25000 0
sbfactor.exe 4847 301191 25000 0
sbfactor.exe 4847 301263 25000 0
sbfactor.exe 4847 301503 25000 0
sbfactor.exe 4847 301863 25000 0
sbfactor.exe 4847 302223 25000 0
sbfactor.exe 4847 302511 25000 0
sbfactor.exe 4847 302847 25000 0
sbfactor.exe 4847 302943 25000 0
sbfactor.exe 4847 303087 25000 0
sbfactor.exe 4847 303423 25000 0
sbfactor.exe 4847 303687 25000 0
sbfactor.exe 4847 303711 25000 0
sbfactor.exe 4847 303831 25000 0
sbfactor.exe 4847 303903 25000 0

(B2=0 --> B2=100*B1 by default)

-Louie

rosebud
06-14-2003, 09:00 AM
The program produces one file for each run. Does it reduce the time needed for another run with the same n and k but with different values for b1 and b2? Or can i just delete it?

biwema
06-14-2003, 11:20 AM
Are the new factors written to a file or are they only displayed in the screen output?
If the latter, it is quite difficult not to lose any factors since the output scrolls quite quickly.

biwema

rosebud
06-14-2003, 12:08 PM
The factors are written to the file fact.txt so that you won't lose anything...

smh
06-14-2003, 04:00 PM
22283121920233 = 2 ^ 3 x 3 x 2161 x 3877 x 110819

Of course, 22283121920233 is prime.

P-1 = 2 ^ 3 x 3 x 2161 x 3877 x 110819

Any news on a version that doesn't miss factors in stage two? (With B1=2000, B2=2500, the prime 2141 is between B1 and B2, so stage 2 should have caught the factor above.)

jjjjL
06-14-2003, 05:37 PM
Originally posted by rosebud
The program produces one file for each run. Does it reduce the time needed for another run with the same n and k but with different values for b1 and b2? Or can i just delete it?

The file it produces will help with higher values of B1/B2. You should probably delete them when we finish testing the program and it actually works better.

I've fixed a small bug that would have made the client produce incorrect results if there were roundoff errors. The bug that makes the exponent slightly wrong at low values is also fixed. However, neither of those is what is causing it to miss factors. I'll let you know when there's a better version. For now, if you want to test it, try and see if you can find factors to save a secret/supersecret test. Numbers between 510k - 600k shouldn't be as hard to factor. We are about to finish the secret test of k=55459 @ n=540k so maybe you can factor some of those and prevent 2 tests around there. You could find numbers by going through SoB.dat and the results.txt file but I'll post some below for those who want them.

-Louie

jjjjL
06-14-2003, 05:38 PM
k = 55459, primality Unknown, status Untested, for all of the following n:

510250 510934 510958 511186 511246 511606 511678 511690 511834 512014
512074 512218 512866 513310 513418 513514 513586 513766 513850 513994
514174 514498 514570 514678 515134 515290 515470 515494 515638 515746
516694 517114 517234 517306 517366 517618 517774 518314 518566 518914
518998 519070 519646 519886 519994 520066 520246 520258 520426 520474
520534 520678 521218 522046 522130 522298 522310 522418 522694 522766
523138 523306 523354 523534 523846 523930 524098 524686 524818 524854
525118 525178 525334 525370 525514 525658 526558 526618 526798 526810
526846 526954 527170 527314 527350 527890 528238 528346 528634 528814
529114 529150 529438 529798 530194 530338 530578 530590 530866 531154
531634 531658 531766 531874 532054 532210 532234 532414 532486 532774
532918 533578 533650 534010 534154 534226 534334 534358 534394 534898
535018 535414 535438 535918 536206 536374 536446 536518 536530 537094
537274 538138 538426 538438 538606 538726 539014 539050 539146 539158
539410 539878

jjjjL
06-14-2003, 06:29 PM
Just got an email from George. He found the major bug and emailed me a fix. A new version is posted below that should find all factors correctly. Test it out and post results. You shouldn't be able to make it miss known factors now. Incidentally, George also noticed that I hadn't yet enabled CPU specific optimizations. So this version not only works, it works a lot faster. :)

http://www-personal.engin.umich.edu/~lhelm/sbfactor05.zip

-Louie

jjjjL
06-15-2003, 10:46 AM
1590402916048273 | 33661*2^510216+1

a 1.5 peta-sized factor! nice :) This factor is almost too large to submit. I may have to allow factors >2^64 soon. If its P-1 had one more large prime, I would have had to already.

1590402916048273 = 2 ^ 4 x 3 x 23 x 43 x 197 x 1381 x 123143 + 1

I found this with B1=50000 B2=5000000 (both a little overkill).
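(indeed: 1381 is the largest stage-1 prime and 123143 the lone stage-2 prime in the factorization above, so even B1=1500, B2=125000 would have caught this one.)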

This is about as smooth a factor as you get. 2^4!? That's begging to be uncovered with P-1.

Let's compare to my last factor:

22283121920233 | 4847*2^302511+1

Other than for testing the code, the last batch of P-1 tests I ran around n=300k was hardly worth it. It only saved us one super secret test that would have taken 20 minutes, and to do that, I ran factoring for about 4 hours. BUT the new factor was found for a completely untested number that was bound to get tested twice (by secret/supersecret). That number would have taken closer to 1-2 hours to test and I found it with a batch file that ran for around 30 minutes. :)

Now that it works and it's fast, P-1 factoring makes sense. It's not a replacement for sieving but it makes a good complement to it. When a number can't be sieved out and is about to be tested, it pays off to spend a little time trying to factor it.

As I see it, there are two immediate areas that should be factored. n=510,000 and n=4,000,000.

The lower one is good because we're about to test n=510k numbers with secret AND supersecret very soon. You can save two tests by finding a factor now. It doesn't take too long to run P-1 factoring at those levels either. I'd say that's a good area to experiment with the new version of the program until we come up with a good strategy for using this program.

n=4M is good too. I'm going to try around there soon. Those numbers are just as likely to have factors and only take a little more time to factor. But instead of saving a few minutes of secret/supersecret doublechecking, you save 3-5 days of testing.

I can think of three things I'd like to see people do:

1) Help me figure out a good allocation strategy for P-1 factoring. (wblipp, I'm looking at you ;))

2) Someone should make a script/webpage/something that reconciles the SoB.dat and results.txt file to make a list of unfactored numbers. (MikeH, I'm looking at you ;)) there's a rough sketch after this list to start from.

3) Test the new SBFactor. It works now and is fast. All the low hanging P-1 fruit is still in the orchard for those who want to pick it. (smh/biwema/rosebud, I'm looking at you ;))
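for item 2, here's a starting-point sketch in python (the file formats are assumptions for illustration only - check the real SoB.txt / results.txt layouts before trusting it):

factored = set()
with open("results.txt") as f:           # assumed format: "p k n" per line
    for line in f:
        parts = line.split()
        if len(parts) >= 3:
            factored.add((int(parts[1]), int(parts[2])))
with open("SoB.txt") as f:               # assumed format: "k n" per line
    for line in f:
        k, n = map(int, line.split())
        if (k, n) not in factored:
            print(k, n)                  # unfactored candidate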

-Louie

ceselb
06-15-2003, 01:45 PM
I'd love to try this at n=4,000,000, if someone could generate a file. Is the code optimized for the P4?

smh
06-15-2003, 02:19 PM
As I see it, there are two immediate areas that should be factored. n=510,000 and n=4,000,000.

Numbers around N=500K have already had a lot of sieving done in comparison to the time it takes to prp them. If you spend the optimal amount of time on P-1 for these low numbers, you only have time to run with low bounds, which makes the chance of finding a factor above the current sieve depth low.

For numbers around N=4M this is completely different. The optimal time to spend on P-1 should be something like the time it takes to prp a number twice (plus a little extra for faulty tests, which need a triple check), multiplied by the chance of finding a factor with P-1 using the optimal bounds. (This doesn't take into account the sieving time spent on each number.)

This makes me wonder: does anyone have an idea how many GHz-days/weeks have been put into sieving? I wonder how much time that is per exponent.


3) Test the new SBFactor. It works now and is fast. All the low hanging P-1 fruit is still in the orchard for those who want to pick it. (smh/biwema/rosebud, I'm looking at you )

At least it finds the two factors it missed with the old version.

I'll plan more P-1 when there is a list of numbers available.



1590402916048273 = 2 ^ 4 x 3 x 23 x 43 x 197 x 1381 x 123143 + 1

I found this with B1=50000 B2=5000000 (both a little overkill).

This is about as smooth a factor as you get. 2^4!? That's begging to be uncovered with P-1.


2^4 is only 16, so it hardly makes any difference. A few primes above 1000 and a large one in the second stage are what give you the large factors.

Moo_the_cow
06-15-2003, 02:48 PM
One major concern about P-1 factoring:

- Mike H's sieve stats could be thrown off A LOT by P-1 factoring. Example: I've sieved about 235 G so far, and my score is about 5,000. A 1GHz PIII would need at least 12 days of computing time (provided that the PC is on 24 hrs a day) to get to that score. However, it would take the same computer only a few hours to get to that score (by finding a 5 peta-sized factor for just 1 number). This would encourage almost all sievers to switch to P-1 factoring, and that is bad since candidates can be eliminated twice as fast by regular sieving as by P-1 factoring.

Until someone resolves that problem,

:machgun: P-1 factoring

or I'll :trash: in protest.

Mystwalker
06-15-2003, 03:36 PM
Hm, a stats munchkin. :bonk: ;)

Well, I'd say that since the factoring and sieving approaches are quite different, there should be a separate statistic.
I don't know the effort-related factors (pun not intended) for the factoring, but they are different from those of sieving. So the effort estimation shouldn't be done on the basis of the factors alone.

Ok, that's basically the same as Moo said... :spank: <-- *selfslap* :p

Well, I think this is something to be discussed along the way, but it should be no obstacle for factoring.
At least as long as it doesn't stop me from climbing the stats ladder. :D

mklasson
06-15-2003, 03:36 PM
Let's say I tried to factor a k/n pair I was just assigned and actually found a factor. What would be a good thing to do next? Report the factor and delete SB's z* file to force it to fetch a new pair?

Mikael

eatmadustch
06-15-2003, 04:17 PM
i think deleting the z file won't work, you'll just start the block from the beginning again. Change your Username, then change it back again (clicking on "yes" if you want to clear the cache). That way you'll get a new test. Or I suppose you can click on expire on the "pending test management" page!

mklasson
06-15-2003, 04:20 PM
Originally posted by eatmadustch
Or I suppose you can click on expire on the "pending test management" page!
Doh! :) That's right, we've got that fancy feature nowadays...

Mystwalker
06-15-2003, 04:36 PM
Or I suppose you can click on expire on the "pending test management" page!

But AFAIK this won't stop the client until it tries to report a new block. I suggest clearing the cache, too.

jjjjL
06-15-2003, 05:05 PM
Originally posted by mklasson
Let's say I tried to factor a k/n pair I was just assigned and actually found a factor. What would be a good thing to do next?

That's a good idea. If you did that, you'd want to:

1) Submit the factor at http://www.seventeenorbust.com/sieve/

2) Expire the test at http://www.seventeenorbust.com/account/testsPending.mhtml

3) Clear your cache and retrieve a new # by modifying your username and saving settings (to prompt the cache clear question). Click yes, then change your username back.

You'd want to do the first 2 steps in that order to avoid having the system reassign your test. You wouldn't have much time either. The server hands out the lowest number, so if you just expired a test, it is most likely going to be next in the queue. I'd guess that the average time from pressing expire to it being reassigned is 50 seconds.

-Louie

Keroberts1
06-15-2003, 05:51 PM
I'm a little unsure what P-1 factoring is, but from watching this thread I've noticed that it appears it can save a lot of work. What is the program that does this? Can it be incorporated with the prp client? How can I use it to help me? Just a few questions for all of us uninformed participants who can't figure out more than half of what is discussed in here.

ceselb
06-15-2003, 06:00 PM
Is there a plain text list of all the factors to be found somewhere?
I can't do much with the sob.dat file. I think there was a file like that distributed with one of the earlier versions of sobsieve, but I deleted it months ago.

Joe O
06-15-2003, 06:11 PM
Ceselb,
What you want is the SoB.txt file included here (http://www.aooq73.dsl.pipex.com/SobDat_n300K-20M.zip)

This will have all the n for each of the 12 k. They are "in the clear", unlike the sob.dat file which has only the deltas to save space.

MikeH
06-15-2003, 06:32 PM
What you want is the SoB.txt file included here

This is clearly going to become very useful, so unless Louie gets there before me, I'll have a look at posting a daily updated version of this file (tomorrow).

MikeH
06-15-2003, 06:42 PM
One major concern about P-1 factoring:
- Mike H's sieve stats could be thrown off A LOT by P-1 factoring.

I'd like to see how this pans out first, but if it does throw the stats out a long way, I have one easy method of sorting it: I could make it so that all factors with p significantly above the current sieving range score as if they were at the top of the current sieving range. Of course this means that anyone who has sieved high would see their score fall overnight, but then it would slowly increase again. But I think this would be fair.

expinete
06-15-2003, 06:52 PM
If I have understood what P-1 factoring does correctly, I think it could save a lot of work if it were integrated into the SB client.

When someone starts working on a new k/n pair, the SB client would run P-1 first, and if it finds a factor in 1 or 2 hours, there is no need to run a prp test that could take about 5 days.

What do you think about this?

Moo_the_cow
06-15-2003, 07:06 PM
quote:
___________________________________________
What do you think about this?
___________________________________________

I don't think that P-1 factoring is good at all.
- P-1 factoring uses an awful lot of memory
- It's hard to intelligently choose B1 and B2 for P-1 factoring, and most users will not like the manual work
- Since the candidates have already been sieved to a pretty good depth, there is a big chance that the remaining composite candidates don't have a factor small enough to be discovered with P-1 factoring
- Even for the candidates which do have a factor small enough to be discovered with P-1 factoring, the amount of time needed to eliminate that number could have been used to eliminate 2 candidates by sieving
- The P-1 factoring software is not well optimized and has a lot of bugs in it, and I think that it will be quite some time before the software is fully tested.

The only project that does P-1 factoring is GIMPS, and they only do that because they can't sieve.

Joe O
06-15-2003, 07:28 PM
MikeH,
AFAIK your sieve scoring is set to reflect the "avoided cost" of PRPing that n. If this is so, then it does not matter how we avoid PRPing that n: sieving, P-1, P+1, ECM, etc. All of these have been used in the past. Look at the old thread(s). So there will now be more P-1'ing. So what! Big deal!
A PRP avoided is a PRP avoided.

biwema
06-15-2003, 07:29 PM
Hi,

I think the P-1 factoring will mix up the stats quite a bit. The new factors are distributed from 1T to beyond 1E.

On the other hand, the update of the scoring algorithm as suggested by Mike is not optimal either. I did quite a bit of sieving (about 750G or so) for testing purposes at 281T and 1126T and would lose more than 7500 points on those factors.

I suggest changing the scoring algorithm so that a big factor scores fully only if the user's reserved or completed range around that factor is large enough that the expected number of factors in it is at least 0.5 or so. That way, factors score fully when the surrounding area was actually sieved by that user (which equals completed work).

i know it's difficult

biwema

jjjjL
06-15-2003, 07:42 PM
Originally posted by Keroberts1
I'm a little un sure what P-1 factoring is but from watching this thread I've noticed that it appears it can save alot of work. What is the program that does this? Can it be incorporated with the prp client? How can I use it to help me? Just a few questions for all of us uninformed participants who can't figure out more than half of what is discussed in here.

A quick explanation of P-1 factoring is presented @ http://www.mersenne.org/math.htm

Right now, our sieve is finding numbers pretty quickly, but the factor density is dropping fast. My computer may go 160,000 p/sec but our factor density is down to less than 1 factor / G (1,000,000,000p). Last I calculated, it was around 0.8. That means it takes an average of 2.2 hours to find a single factor. There's also about a 15% chance the factor is for an already factored number. Oops, closer to 2.6 hours now for each useful factor. This gets worse and worse each day we factor. When we hit 50T, factor density will fall to around 0.4 ==> 5 hours / factor. I don't suggest we stop any time soon, but the proverbial low-hanging fruit is very much gone. Incidentally, the thing that has really maintained the sieve speed has a lot to do with Jobling's Law(tm), which states that Paul Jobling will release a new version of SoBSieve that's 2x faster every 18 days. ;) I've always hoped someone would do a study on the siever and graph speed increases. If no one does soon, I'll probably make my own.

P-1 factoring is a much more targeted approach to finding factors. It is also not dependent on the factor density like sieving. It's dependent on how "smooth" the factor minus 1 is. Not all the numbers will have a smooth factor, but those that do will give us VERY large factors rather quickly. That 1500 trillion sized factor I found won't be found by the regular sieve for a long long time. By the time regular sieving would have found it, the number would probably have been double checked and a factor would be of little use.

So that's the motivation and the explanation.

As for the "what does it all mean for the project as a whole?" part, it's hard to say right now. GIMPS finds factors for about 3.2% of the numbers it tests by P-1 factoring (i think). It also takes them 2.2% of the time it would have taken to just test the numbers: a net gain of 1% in throughput. What should we expect? There is reason to believe it may be a little more, but probably not a whole lot.

[FYI - The reason to believe goes something like:

GIMPS is able to trial factor numbers MUCH higher than us due to the special form of their factors. Something like 2^64. We're in 2^44 land. That difference is kinda huge. I'm still trying to decide if the difference all cancels out since they know that their exponent divides each factor minus 1 in P-1 factoring. It may completely cancel out or it may at least partially. I'll let you know if I figure that out.]

As for Moo, why you be hating on P-1 so much? ;)

- It's hard to intelligently choose B1 and B2 for P-1 factoring, and most users will not like the manual work
--> there's nothing saying users will have to. i'm working on making the program pick B1/B2 intelligently for you based on how likely it is to find factors. it may eventually be in the client so people can just check a box and do it if they want.

- Since the candidates have already been sieved to a pretty good depth, there is a big chance that the remaining composite candidates don't have a factor small enough to be discovered with P-1 factoring
--> we just need one smooth factor that is larger than the current sieve depth. a lot of our numbers have these just waiting to be uncovered.
- Even for the candidates which do have a factor small enough to be discovered with P-1 factoring, the amount of time needed to eliminate that number could have been used to eliminate 2 candidates by sieving
--> maybe for another month, but sieving is slowing down awfully fast. and those 2 theoretical factors you could have found sieving probably eliminated the need to do a test in a range much higher than we're testing in. so those 2 factors will only truly be useful in a couple years. P-1 factoring can find factors for numbers that we're doing right now. even if you're pessimistic about it, at least half of the factors found sieving 3M-20M will not be needed for years and will not speed up how quickly we find the next prime. more realistically, 75-85% of the factors sieved out right now don't speed up the project "right now." i'm all for thinking long term, but we can't lose sight of shorter term goals.
- The P-1 factoring software is not well optimized and has a lot of bugs in it, and I think that it will be quite some time before the software is fully tested.
--> SBFactor v0.5 is fully optimized and has no bugs. I think 3 days of development time is pretty darn good.


-Louie

Keroberts1
06-15-2003, 08:04 PM
what program do we use? where do we get it?

Nuri
06-15-2003, 08:06 PM
To avoid duplication of work:

I am trying the program for k=21181 and 510k < n < 520k (with B1=25000 B2=B1*100).

Joe O
06-15-2003, 08:12 PM
Originally posted by Keroberts1
what program do we use? where do we get it?

This program, here. (http://www-personal.engin.umich.edu/~lhelm/sbfactor05.zip)
Usage:
sbfactor.exe k n 25000 0

jjjjL
06-15-2003, 08:21 PM
Originally posted by MikeH
I'd like to see how this pans out first, but if it does throw the stats out a long way, I have one easy method of sorting it: I could make it so that all factors with p significantly above the current sieving range score as if they were at the top of the current sieving range. Of course this means that anyone who has sieved high would see their score fall overnight, but then it would slowly increase again. But I think this would be fair.

I think I agree with Joe on this one. I don't see the point in handicapping sieving or P-1 factoring or whatever. Ideally, factoring scores would be similar to a "how much time did this save the project" kinda thing. realistically, this isn't done because otherwise I'd be credited with 50% of the "work saved" since I was the one who factored the first 25G (where a lot of the factors were). your current scoring does a good job of adjusting the size of the factors to match how hard they were to find. if someone can find a huge factor using P-1, I don't see why it should be discounted.

regardless of how you deal with P-1 factoring, I have a couple other suggestions.

I could add a column to the results.txt to show how many tests have been completed for each number. so a 0-factor (untested) should probably get a boost over a 1-factor (tested once), which should get a boost over a 2-factor (tested twice). this would encourage people to basically "do what helps the project most", especially with P-1 factoring (people won't factor small, already double-checked numbers and will try and factor completely untested numbers). I know you're a proponent of the DC sieve and a large contributor there so it may seem cruel to discount that work by a fraction but it may be more fair. just an idea. i don't expect you to discount them by half, but maybe by like 25% for factors of once-tested numbers and 80% for factors of double-tested numbers.

also, last I saw, your stats update at 10AM UK time. well I know I posted it somewhere, but I just wanted to make sure that you're aware: the results.txt file is dumped at 3:40AM EST. I'm guessing you're GMT and since I'm -5 that makes it an hour and 20 minutes between when the file is updated and your stats get updated. You could make it a little fresher by updating at 9AM. In fact, if you'd like, I could make it update more often or at different times. It only takes 2 seconds to run so it's not too big a deal to do it a few times a day. I made it 3:40AM back when no one was using it. Now that there are users, that kinda thing may matter more. My only real concern is that it will take too much bandwidth. I'll probably compress it and increase the portion stored off the server in lowresults.zip if you want to download it all the time. gzip files ok?

-Louie

Joe O
06-15-2003, 08:22 PM
I've already run
sbfactor.exe 55459 539878 25000 0 with no factor found, i.e. k is 55459 and n is 539878
I'm in the process of running 432 < n < 2998 for various k. I've found factors, but these are already known factors. My list of k n pairs had not excluded them or I would not have run them. But at least it is more proof that the program works, and works well.

Joe O
06-15-2003, 08:26 PM
Oh yes, I'm also running
sbfactor.exe 55459 539410 25000 0
sbfactor.exe 55459 539158 25000 0
sbfactor.exe 55459 539146 25000 0
sbfactor.exe 55459 539050 25000 0
sbfactor.exe 55459 539014 25000 0

from Louie's list. I'll post as soon as I'm done.

Moo_the_cow
06-15-2003, 09:02 PM
OK, here is my stats proposal:
There should be 3 different stats: 1 for regular sieving, 1 for DC sieving, and 1 for P-1 factoring.
For the regular sieving and DC sieving, scores should be based on the number of ranges you have completed, regardless of how low/high they are. To discourage people from deliberately withholding factors, each completed range must have at least 70% of the expected number of factors, and if it doesn't, the range won't count unless a double check of that range shows that it is simply an unlucky range.
For the P-1 factoring stats, scores should be based on the number of factors you find, no matter how big they are.

Moo_the_cow
06-15-2003, 09:09 PM
quote:
_________________________________________________
Incidentally, the thing that has really maintained the sieve speed has a lot to do with Jobling's Law(tm) which states that Paul Jobling will release a new version of SoBSieve that's 2x faster every 18 days.
_________________________________________________

I'll revise that:
Jobling's Law states that Paul Jobling will release a new version of SoBSieve that's 2x faster every 30 days.
SoBSieve v.1.06 was running at 8000 k/p/sec on my PC on January 24th, and 120 days later (May 24th) SoBSieve v.1.34 is running at 120000 k/p/sec on my PC, an improvement of 15x or about 2^4.

ceselb
06-15-2003, 09:12 PM
I've compiled a list from this (http://www.aooq73.dsl.pipex.com/SobDat_n300K-20M.zip) file. It contains all 4m < n < 4.4m, by k. Get it here (www.visit.se/~ceselb/4-4.4m.zip)

I have not checked whether they have a factor; you have to check that yourself against this (http://www.aooq73.dsl.pipex.com/results_no_duplicates.zip) file.

Hope this helps and keeps everyone busy until we get something better. :cheers:

Nuri
06-15-2003, 10:19 PM
Thanks ceselb. You've done a good job.

In fact, what I'd suggest is something like this.

I agree that we should focus on where our computing power makes the most contribution to the project.

As far as the coordination is concerned, how about creating a text file for each k separately that has only the untested candidates (and those with no known factors, of course), including the ~14000 remaining for secret, and updating those on a daily basis (also taking into account the additional factors submitted through the sieve).

Better than that, we might consider two enhancements:

- Not including all untested numbers, but only those that are more than one week away from the start of their prp testing (which would roughly be max n + 100K for n>4m, and max n + 10K? for secret).
- Also, excluding the numbers that are reserved (meaning being worked on, hopefully) for P-1 tests.

If everybody uses these files when reserving their P-1 test ranges, this way, I hope, we can:
- Test numbers that we need to test most, and
- Avoid duplication of effort.

Any other ideas?

Keroberts1
06-16-2003, 12:02 AM
how do i use it? do i need to install it someplace special?

wblipp
06-16-2003, 12:46 AM
Originally posted by jjjjL
1) Help me figure out a good allocation strategy for P-1 factoring. (wblipp, I'm looking at you ;))

-Louie

There is an easy question and a hard question. The easy question is which numbers to test - I think it's clear that we want to stay a short ways ahead of the prp testing - far enough ahead that we are pretty sure P-1 will complete before the handout, but near enough that sieving still has a chance of getting in to save the P-1 test.

The hard question is how to set the B1 and B2 values. I think we need to use the Dickman functions to estimate the probability of smooth factors, and then we need to know how much work it is to test as a function of B1 and B2.
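For the Dickman part, a small numerical sketch (first-order integration of u*rho'(u) = -rho(u-1) with rho = 1 on [0,1]; rho(ln x / ln B1) then approximates the chance that a random number near x is B1-smooth):

def dickman_rho(u, h=0.001):
    if u <= 1:
        return 1.0
    steps = int(round(u / h))
    one = int(round(1 / h))
    grid = [1.0] * (steps + 1)      # grid[i] ~ rho(i*h)
    for i in range(one + 1, steps + 1):
        t = i * h
        grid[i] = grid[i - 1] - h * grid[i - one] / t
    return grid[steps]

print(dickman_rho(2.0))   # ~0.307, i.e. 1 - ln 2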

jjjjL
06-16-2003, 12:56 AM
you may want to grab a copy of the GIMPS source and check out the guess_pminus1_bounds() function in commonc.c. It's actually compiled into the current version but it doesn't yet work (it always returns 0,0). Some modification of this is probably what we want.

http://www.mersenne.org/gimps/source23.zip

-Louie

jjjjL
06-16-2003, 01:07 AM
Originally posted by Joe O
I'm in the process of running 432 < n < 2998 for various k.

The other day when I updated the factor verifier, I changed the lower bound for n from 1 to 1000. The new bounds are all on the sieve page. Anyway, it may not be that those factors are known; it may be that they are simply being rejected by the verifier.

I'll update it to allow you to submit factors again soon.

-Louie

Keroberts1
06-16-2003, 02:06 AM
how do i get the program running? is there a .dat file i need? a special directory to install it to? I don't know why, but when i open the file that was posted in response to my question earlier it simply flashes on the screen and disappears.

jjjjL
06-16-2003, 03:55 AM
NEW version of SBFactor ready. :)

http://www-personal.engin.umich.edu/~lhelm/sbfactor06.zip

v0.6 now has experimental (but working) ECM factoring support. usage is as follows:

sbfactor.exe k n B1 B2 curves ecm

such as

sbfactor.exe 33661 4000848 1000 0 1 ecm

this means I want B1 to equal 1000, B2 to be 100x larger than B1, I want to run only 1 curve and then I add the keyword ecm at the end to make the program use ECM mode.

ECM factoring is more difficult to understand than P-1. Here's a quick explanation from the GMP-ECM readme:


The P-1 method works well when the input number has a prime factor P such that P-1 is "smooth", i.e. has all its prime factors less or equal the step 1 bound B1, except one which may be less or equal the second step bound B2. For P=67872792749091946529, we have P-1 = 2^5 * 11 * 17 * 19 * 43 * 149 * 8467 * 11004397, so this factor will be found as long as B1 >= 8467 and B2 >= 11004397.

There is no need to run P-1 several times with the same B1 and B2, like for ECM, since a factor found with one seed will be found by another one.

The ECM method is a probabilistic method, and can be viewed in some sense as a generalization of the P-1 and P+1 method, where we only require that P+t is smooth, with t random of order P^(1/2). The optimal B1 and B2 bounds have to be chosen according to the (usually unknown) size of P.

Anyway, expect the program to run slower when doing ECM vs P-1 factoring at similar B1/B2 levels. There is also a probabilistic nature to how it works, so a factor may not always be found on successive runs like it would with P-1.
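(rule of thumb for the probabilistic part: if a single curve finds a given factor with probability p, then c curves find it with probability 1 - (1-p)^c - which is why ECM advice is always phrased as "run so-many curves at such-and-such B1".)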

A couple things to be aware of:

The "Normalizing Brent value" step in the begining can take a minute or two for large numbers. It will appear to immediately hang but it will get though it. This is because it's doing a brute force modular inversion of a very large number.

Also, there is no file saving yet. There will be soon.

I fixed a tiny bug in the P-1 code that caused the total time to be reset after saves, making it totally wrong. Nothing serious but it should work correctly now.

Let me know if you test out the new version. I'm pretty sure it's solid. Most people probably shouldn't deal with this feature but it's ok to have it for those of you that know how to use it.

-Louie

Joe O
06-16-2003, 09:31 AM
Originally posted by Keroberts1
how do i get the program running is there a dat. file i need ? a special directory to install it to. I don't know why but when i open the file that was posted in response ot my question earlier it simply flashed on the screen and dissapeared

Pick/create a directory
Put the program in that directory
pick a k value, one of the 12
pick an n value (or more than one) for the k value you chose
create a something.bat file in your directory with one or more lines of the following form:
sbfactor.exe k n 25000 0
run the bat file
look for a fact.txt file in that directory - if there is one, submit the lines the same way you would the results of a sieving run
pick another k - or the same one
pick some more n
create a s.bat file or modify the something.bat file
run it
look for fact.txt and submit new results
...

smh
06-16-2003, 10:22 AM
To find most 15 digit factors you should run about 25 curves with B1=2000 (ECM). I haven't tested how long each curve would take though.

Louie:
Can you make the next version run at idle priority or at least give it a command line option to do so? And also an option on how often to update the screen?

I think the biggest problem with manual P-1 is the coordination. There are so many numbers that it will be hard to keep track of which numbers are done and with what bounds.

It would be great if a future SoB client would have an option to P-1 before the prp test starts.

If it takes a pc 5 days to prp a number and the chance of finding a factor is 3% (using optimal bounds), it's worth spending

(2 * 5 * 24 * 1.02) * 0.03 ~ 7 hours

on P-1, so the currently used B1=25000 is actually too low (although i'm trying a few numbers with B1=15000 right now).

edit
Just thought of one more thing:

Can you let the program create a results file with all numbers tested (with what bounds, and any factors found)?

Nuri
06-16-2003, 12:51 PM
Originally posted by Joe O

create a something.bat file in your directory with one or more lines of the follwing form:
sbfactor.exe k n 25000 0


Keroberts1,

In case you have problems creating one, here's a sample *.bat file for the remaining candidates of k=4847 for 4,000,000 < n < 4,001,000.

The lines in it are:

sbfactor.exe 4847 4000167 50000 0
sbfactor.exe 4847 4000287 50000 0
sbfactor.exe 4847 4000503 50000 0
sbfactor.exe 4847 4000527 50000 0
sbfactor.exe 4847 4000623 50000 0
sbfactor.exe 4847 4000743 50000 0

Just unzip it to the same folder as the P-1 client, modify as you like, and run it.

EDIT: Forgot to mention, you can use notepad, wordpad etc. to view, edit and save. If you are using notepad, just save it by adding .bat at the end of the filename you select.

Nuri
06-16-2003, 01:21 PM
Originally posted by Nuri
To avoid duplication of work:

I am trying the program for k=21181 and 510k < n < 520k (with B1=25000 B2=B1*100).

Progress report:

28 tests finished out of 43 in 17 hours, and only one factor found so far. That works out to 36 minutes per test on average.

PS: Machine used is PIV-1700, 7/24, 99% of CPU time. Version used is 5.

Do you think those figures are normal? (both the speed and number of factors found per test).

If the figures are normal and, if I am not particularly unlucky finding only 1 factor within 28 tests, then it is far more beneficial to prp test instead of P-1 at the n=510k level.


Just a quick comparison:

It takes 75 minutes to prp test a candidate at 510k. Taking a conservative view and assuming that in 10% of the tests the residues of secret and supersecret do not match, it takes 2.1 tests to completely get rid of a candidate. => 75 * 2.1 = 157.5 minutes. (or ~9 candidates per day).

I would have had to be 6 times luckier to match the prp effectiveness at 510k. :(

I am sure things are different at higher n values.

Any other benchmarks?

BTW: The factor found is 4965673550983 | 21181*2^510164+1. Nothing impressive.

jjjjL
06-16-2003, 02:03 PM
Originally posted by Nuri
I would have had to be 6 times luckier to match the prp effectiveness at 510k. :(

I am sure things are different at higher n values.

I suggested that people might want to try around n=510k because they can be factored quicker than larger numbers. It's a good way to get used to the new program while it's still so user-unfriendly. I agree that it probably doesn't make sense to systematically test all numbers around 510k.

I'm gonna work on fixing the auto-bound setter. When I get that working, it will be much simpler for coordination purposes. We won't have to track the bounds used for each person on each number. People will just reserve all the n values in a short range and test them to the optimally chosen bounds.

After that I'll program in a .dat file reader so that instead of processing each number separately through a clunky batch file, you'll just do something like

sbfactor.exe SoB.dat 511000 511500 -optimal

as long as you have a new SoB.dat file, you won't have to sort through the text files and check for new factors. or maybe a SoB.dat / result.txt parser so that you can just get a new result.txt file instead of having to use it to update a .dat file.

Anyone find any interesting factors using ECM yet? I have a bunch of random ones I found during testing but nothing too impressive:

17311 | 3*2^4000+1
7 | 3*2^4000+1
7 | 3*2^4000+1
19089 | 4847*2^1000+1
670260543 | 4847*2^3000+1
303 | 4847*2^5000+1
692961 | 4847*2^6000+1
1933885607192733 | 4847*2^6600+1
14361 | 4847*2^6660+1
285261 | 4847*2^6666+1
2879443 | 4847*2^6667+1
189 | 4847*2^6670+1
189 | 4847*2^6670+1
55243523 | 4847*2^6675+1
35376326783 | 4847*2^667+1
189 | 4847*2^6670+1
10868360769 | 4847*2^17680+1
89523 | 4847*2^17680+1
11572653225845 | 4847*2^17685+1
3 | 22699*2^9607+1
21 | 22699*2^19607+1
127049127 | 22699*2^39607+1
69 | 22699*2^69607+1
3 | 22699*2^99607+1
442191729 | 22699*2^199607+1
3 | 55459*2^52537+1

-Louie

Nuri
06-16-2003, 03:14 PM
Anyone find any interesting factors using ECM yet?

Tested 30 of 43 numbers with version 0.5.

Trying remaining 13 with version 0.6.

I'll post results when done.

EDIT: Used sbfactor.exe k n 1000 0 1 ecm for the remaining numbers. It took 6.5 minutes per test. No new factors. So, 1 in 43.

PS: Using 25000 instead of 1000 together with ecm would probably make each test last ~4 hours.

smh
06-16-2003, 03:22 PM
If the figures are normal and, if I am not particularly unlucky finding only 1 factor within 28 tests, then it is far more beneficial to prp test instead of P-1 at the n=510k level.

I don't think you're really unlucky. If you find 2 factors on the 43 tests you're finding factors for about 5% of the numbers.

It's definitely not worth trying P-1 on 500K numbers with these high bounds. That's what the optimal bounds are for.

You might have been luckier trying more numbers with lower bounds in the same time, but i guess even sieving is already past the optimal depth for numbers around 500K so most easy factors have already been found.

Trying P-1 around 4M is a different story though. Each test might take longer to perform but if you still find factors for 3-5% of the candidates overall throughput will increase (especially if you take double checking into consideration).

MikeH
06-16-2003, 03:38 PM
quote:
--------------------------------------------------------------------------------
I think I agree with Joe on this one. I don't see the point in handicapping sieving or P-1 factoring or whatever. Ideally, factoring scores would be similar to a "how much time did this save the project" kinda thing....
--------------------------------------------------------------------------------

I agree totally with what you are saying here, I just need to get the implementation sorted. As you'll see from today's stats, the current scoring doesn't work - biwema jumps straight to the top.

The current scoring is really a "even though you get fewer factors the further we go, you still score the same as a similar sized lower range". Thus score is directly proportional to computing effort (or thereabouts). What I'd like to do is the same for P-1 factoring - when we have some data for the relative effort expended per factor, then we can follow the same line as the current sieve focused scoring. I like biwema's idea for identifying factors not from sieving.

quote:
--------------------------------------------------------------------------------
I could add a column to the results.txt to show how many tests have been completed for each number. so a 0-factor (untested) should probably get a boost over a 1-factor (tested once) which should get a boost over a 2-factor (tested twice).
--------------------------------------------------------------------------------

This would be really useful. I tried to do something like this early on, but not having access to enough data meant I gave it up and just made things simpler, but less focused on finding factors for untested candidates.

quote:
--------------------------------------------------------------------------------
also, last I saw your stats update at 10AM UK time. well I know I posted it somewhere, but I just wanted to make sure that you're aware: the result.txt file is dumped at 3:40AM EST. I'm guessing you're GMT and since I'm -5 that makes it an hour and 20 minutes between when the file is updated and your stats get updated. You could make it a little fresher by updating at 9AM. In fact, if you'd like, I could make it update more often or at different times. It only takes 2 seconds to run so it's not too big a deal to do it a few times a day. I made it 3:40AM back when no one was using it. Now that there are users, that kinda thing may matter more. My only real concern is that it will take too much bandwidth. I'll probably compress it and increase the portion stored off the server in lowresults.zip if you want to download it all the time. gzip files ok?
--------------------------------------------------------------------------------

With EST do you operate day light saving? That 10AM is GMT+1, so it's really 9AM GMT. If it's easy, how about updating 4x per day - I'll need to think about my side, but at least it will all be in place when I'm ready. How about moving the lowresults.zip up to 3T, that will significantly shrink the results.txt file. gzip files are OK with me.

Mike.

jjjjL
06-16-2003, 04:42 PM
Mike,

I updated the factor result files. I think I got all our wishes covered. I added the NumTests column to the end of every factor line. I also have the script setup to gzip the result file a couple minutes after it makes it. The updates are now at 3:40/9:40 AM/PM EST. So you can have your stats update at every 10 and 4 your time. File is here:

http://www.seventeenorbust.com/sieve/results.txt.gz

I also raised the lower limit from 1T --> 3T per your suggestion. This plus the gzipping brings the filesize down from over 3MB to 700k even though there is a new column. :)

The new archive of factors for p < 3T is here:

http://www-personal.engin.umich.edu/~lhelm/lowresults.txt.gz

I look forward to seeing how you use the new info. Any other data you can think of that would help with scoring?

-Louie

garo
06-16-2003, 06:47 PM
961094450858074349 | 67607*2^4022171+1


Just wanted to report a real big useful factor!!

biwema
06-16-2003, 07:00 PM
Hi,

I have been trying P-1 on k=21181 between n=350000 and n=390000 so far.
There were 142 candidates and I found 7 factors with the bounds 100000 and 1000000.

I think (just a feeling) that it is not optimal to set B2 100 times bigger than B1. It takes too much time and the benefit is too small. I recommend a B2 about 10 to 25 times bigger than B1.
With a B1 of around 100000 (a bit more than the suggested 25000) and a B2 of 1 to 2.5 million the efficiency might be a bit better.

My 7 factors look like this:
#1 374P, b1=15000, b2=90000
#2 18T, b1=3000, b2=125000
#3 324T, b1=250, b2=40000
#4 composite: product of a 10 digit and 13 digit factor
#5 256P, b1=3500, b2=15000
#6 2P, b1=30000, b2=125000
#7 2E, b1=90000, b2=800000

most of these factors are well below the bounds.
If I had set the bounds twice as big, I probably would have found 1 or 2 more factors, but it also would have taken twice as long to test that range.

By the way, sbfactor does not correctly measure the time of stage 2. It is much too low (or does that only happen on my computer?). That might be a reason people choose such a big b2?

Maybe that might change a bit with bigger exponents.

good luck

biwema

Moo_the_cow
06-16-2003, 07:14 PM
quote:
--------------------------------------------------------------------------------
I think I agree with Joe on this one. I don't see the point in handicapping sieving or P-1 factoring or whatever. __________________________________________________

and

quote:
______________________________________________
I agree totally with what you are saying here
______________________________________________

Look here, at Garo's factor:
961094450858074349 | 67607*2^4022171+1

EVEN IF IT IS A DUPLICATE, it would score over 9 million on
Mike's sieve stats!! That's >7x more than what we have sieved
COMBINED! There's got to be a better way to score the factors,
or

:machgun: P-1 factoring

Nuri
06-16-2003, 07:17 PM
Biwema,

Are you using version 0.5 or 0.6?

It seems to me that version 0.6 measures the time correctly. Well, I tested it with the ecm parameter only, but it seemed right. If you mean version 0.5, you're right, its time measurement was wrong.

garo
06-16-2003, 07:54 PM
It's an original factor - didn't find it in the latest files posted by Louie and the n was listed as uncleared in one of the lists posted recently.

It took me about 5 hours on an Athlon 1333 with bounds of 25000 and 2500000. Looking at the factorization of P-1 one can see that bounds of 6000 and 160000 would have sufficed.

961094450858074349 - 1 = 2^2 * 47^2 * 61 * 2243 * 5107 * 155663
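That "look at the factorization" step can be mechanized: the minimal B2 is the largest prime in p-1 and the minimal B1 the next one down. A throwaway C sketch (assumes the factor fits in 64 bits):

#include <stdio.h>
#include <stdint.h>

/* smallest bounds that would have caught a known factor p */
void min_bounds(uint64_t p, uint64_t *B1, uint64_t *B2)
{
    uint64_t m = p - 1, largest = 1, second = 1;
    for (uint64_t q = 2; q * q <= m; q++)
        while (m % q == 0) { second = largest; largest = q; m /= q; }
    if (m > 1) { second = largest; largest = m; }   /* leftover prime */
    *B1 = second;
    *B2 = largest;
}

int main(void)
{
    uint64_t b1, b2;
    min_bounds(961094450858074349ULL, &b1, &b2);
    printf("B1 >= %llu, B2 >= %llu\n",
           (unsigned long long)b1, (unsigned long long)b2);
    /* prints B1 >= 5107, B2 >= 155663, hence "6000 and 160000" */
    return 0;
}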

I looked at how much P-1 has been done on the numbers in the 4million range in GIMPS and how much the client currently tries to P-1 on a P4. Here is the result:

If the number is not tested and has been trial factored to 62 bits:
B1= 45,000 B2=753750

If it has been factored to 63 bits the bounds drop to

B1=40000 B2=640000

If it has already been tested once, ie in the doublecheck phase:

B1=20,000 B2=260,000


So it looks like biwema's estimate of keeping B2 about 10-25 times B1 is correct, as is his suggestion of raising B1. I'd suggest 50,000 and 1 million as reasonable bounds unless someone does an analysis.

[Edit: If we consider the fact that sieving has only reached 2^45 whereas GIMPS factors these numbers to 2^61, maybe raising B1 even more might be useful. But then we also have to consider the fact that GIMPS factors have a special form so their bounds can afford to be a little higher.]

jjjjL
06-16-2003, 08:11 PM
:) NEW SBFACTOR v0.8 :)

this version has the features I was talking about adding, namely:

-Working optimal bound picker
-SoB.dat decoder
-Ability to scan lowresults.txt/results.txt for known factors

The usage for the optimal-bound/SoB.dat execution is:

sbfactor.exe SoB.dat lowresults.txt results.txt <n low> <n high> -x <how far factored> <double check?>

so if I wanted to P-1 factor all unfactored numbers between n=4000000 and n=4000100, I'd enter:

sbfactor.exe SoB.dat lowresults.txt results.txt 4000000 4000100 -x 45 0

here's a kinda tricky part: the 45 there means that the numbers in that range have been sieved up to 2^45 = 35T. that's not exactly true but it will produce more accurate bounds than 44 since the majority of p < 35T is sieved.

the final 0 just means that the number is completely untested. if you wanted to try factoring a range of numbers around n=3M, you would want to set that to 1 to signify that the value of a factor is less since it will only prevent a double check and not a new test.

To make this REALLY easy to use, I've included a batch file that only needs the bounds of the range of n you want to test. This means you just download the new version, unzip it and type:

run 4010000 4010100

wait for it to finish, open up fact.txt and submit. :) another great thing is that even though there is no master file that tracks progress, the cache files will prevent retesting of numbers if you reboot your system or quit the program. like SB, the most you could lose would be 10 minutes. so just add another batch file to the directory that calls run.bat and add it to your startup folder and you could have your very own P-1 factoring system.

the new download comes packaged with SoB.dat, lowresults.txt, and results.txt so it is much larger than the last versions. It is the original dat file so if you want to factor ranges less than 3M, use a different dat file.

be warned though, the optimal bound setter will return B1=0 for numbers around n=510k. It looks at the likelihood that a factor will be found and immediately realizes that just testing the number is faster. it uses the same multiplication routines that SB does for squaring numbers so it doesn't need to even know how fast your computer is. it's pretty nice.

n=1M will also immediately quit if you indicate there's already a test done for it (which you should). it starts returning non-zero B1 values around n=1.7M. For more reasonably sized tests around n=4M it returns things in the neighborhood of B1=80000, B2=440000. i'll post the code it's using for bounds setting as an attachment here so that people can examine it before we decide that it is what we want for "official" testing of different ranges. once we agree on the fine points of the bounder, it wouldn't be inconceivable for people to simply reserve small ranges of n (for all k) where they will let the factorer run with optimal bounds. the new system is pretty easy to use compared to manually editing batch files. it will get easier too.

one last thing I want to mention. the program now returns "success values" to help you estimate how many factors it will find. here's an example:



run 4100000 4105000

sbfactor.exe SoB.dat lowresults.txt results.txt 4100000 4105000 -x 45 0
SBFactor v0.8
P-1 and ECM factoring for number of the form k*2^n+1.
Adapted from GIMPS v23.4 by George Woltman and Louis Helm
Finished parsing SoB.dat
218 numbers between 4100000 < n < 4105000
Searching for known factors in lowresults.txt...Done.
Searching for known factors in results.txt...Done.
Removed 22 numbers using the factor files
Testing 196 numbers between 4100000 < n < 4105000
Expected number of factors for entire range: 4.173580
B1=80000 B2=440000 Success=0.021294 Squarings=200766
P-1 on 4847*2^4100151+1 with B1=80000, B2=440000
initializing test
sieve finished


So it expects that if I test all 196 unfactored numbers in that range to the optimally chosen levels, I should find about 4 factors. The expected number of factors is simply the sum of the success value of all 196 tests which are individually printed before each test starts.

Let me know how the new version works and what you think of the optimal bound setter.

http://www-personal.engin.umich.edu/~lhelm/sbfactor08.zip

-Louie

PS - I am actually doing the "run 4100000 4105000" range for those looking to avoid duplication. thanx.

Moo_the_cow
06-16-2003, 10:29 PM
Louie, could you include a "how to.txt" in the file that explains
how to use the client? It took me 2 hours (literally) to figure out
how to use the client the first time that it was released.

Joe O
06-17-2003, 01:20 AM
Louie,


Removed 22 numbers using the factor files
Testing 196 numbers between 4100000 < n < 4105000
Expected number of factors for entire range: 4.173580

Somewhere around here could you have the program print the 196 k n pairs that will be factored? or write them to a file? I think that it would be very helpful to have this info.

Joe O
06-17-2003, 09:16 AM
Louie,
Would it be possible for the program to check the last updated date of the results.txt file and reprocess it if it has changed? If this could be done between P-1 runs it would allow us to automatically eliminate factors found since the last time we downloaded the results.txt file.

smh
06-17-2003, 01:26 PM
PS: Using 25000 instead of 1000 together with ecm would probably make each test last ~4 hours.


Please note that ECM and P-1 are completely different things.

With ECM you need to run the program several times on the same number to have a reasonable chance to find a factor.

You'd need to run it about 25 times with B1=2000 to find most 15 digit factors. To find most 20 digit factors you need to run the program 90 times with B1=11000

Joe O
06-17-2003, 01:39 PM
Originally posted by smh
A few questions:

1) Is there a file available sorted by N with both K and N on one line? This would make it easier to reserve a range.

2) What bounds are you using?

3) maybe a moderator can make a reservation list like in the sieving topic and delete all messages below it? Just to make things more clear.

1) Not needed if you use version 8, available here. (http://www-personal.engin.umich.edu/~lhelm/sbfactor08.zip) Just choose an n range that you want to work on and post it in the coordination thread. (http://www.free-dc.org/forum/showthread.php?s=&postid=30123#post30123) Then typing "run.bat n_min n_max" in the directory where you unzipped version 8 and pressing enter will get you started. e.g. "run 4080000 4085000"
2) Automatically chosen by the program. Look here. (http://www.free-dc.org/forum/attachment.php?s=&postid=30051)
3) That's what this is (http://www.free-dc.org/forum/showthread.php?s=&postid=30123#post30123). Ceselb/Louie will be along shortly to clear the extra posts.

For more information look back up to Louie's post. (http://www.free-dc.org/forum/showthread.php?s=&threadid=3160&perpage=25&pagenumber=4)

smh
06-17-2003, 01:40 PM
Louie:

the program still doesn't run at idle priority by default.

jjjjL
06-17-2003, 02:11 PM
Originally posted by smh
Louie:

the program still doesn't run at idle priority by default.

Yeah, I know. I already got told on the other thread. They reminded me of the whole idle-thing right after I finished v0.8.

I have made a new version that runs idle and also processes the result files much faster (0 seconds instead of 3 ;)).

I was going to throw a GUI wrapper around it and add progress bars instead of scrolling messages but it may take a day. I was going to build a new window interface with MFC but after an hour of working with it, I decided I still don't like MFC.

I'll probably just gut the SB GUI code and start there.

-Louie

ceselb
06-17-2003, 02:49 PM
Copied from the coordination thread.

Originally posted by cjohnsto
Or if you want to do it as part of a batch file use:
start /LOW /b /wait ..\ecm -c 500 250000 < in.txt >> out.txt

Of course use the program you want not ecm.

Does anybody understand what he means?

ceselb
06-17-2003, 03:06 PM
C:\sieve\sbfact>sbfactor.exe SoB.dat lowresults.txt results.txt 19999000 200000000 -x 45 0
SBFactor v0.8
P-1 and ECM factoring for number of the form k*2^n+1.
Adapted from GIMPS v23.4 by George Woltman and Louis Helm
Finished parsing SoB.dat
37 numbers between 19999000 < n < 200000000
Searching for known factors in lowresults.txt...Done.
Searching for known factors in results.txt...Done.
Removed 2 numbers using the factor files
Testing 35 numbers between 19999000 < n < 200000000 <-- takes several minutes at this stage. Could it be done quicker?
Expected number of factors for entire range: 2.425795
B1=650000 B2=16412500 Success=0.069308 Squarings=2020159
P-1 on 4847*2^19999071+1 with B1=650000, B2=16412500
initializing test
sieve finished

C:\sieve\sbfact>

Does it exit because of memory shortage for that large B range, or is it a bug? Not a big problem, since I'm not planning to factor anything there for a couple of years more. :D

Nuri
06-17-2003, 03:34 PM
May it be because you wrote 200000000 instead of 20000000? Just a thought.

ceselb
06-17-2003, 03:47 PM
Nope, re-ran it with one 0 less. Still the same result.

Nuri
06-17-2003, 03:57 PM
Try it with 19999999 instead of 20000000. It'll work. :thumbs:

Though I don't know why.

Nuri
06-17-2003, 05:45 PM
How long does a 4m P-1 test take on the average?

I started my first test, and it took 80 minutes to reach 38% of stage 1 (PIV-1700, CPU usage %99). Is this normal?

A quick calculation:

My range has 179 numbers and 5.5 expected factors. Assuming it would take 4 days to finish a 4m prp test, testing 5.5 candidates would mean 22 days. Adding double check for all, and 10% non-matching residues, it would take 46.2 days to get rid of them completely.

For P-1 to be effective in terms of eliminating candidates (on a PC basis), it should finish each P-1 test in less than 6.2 hours (=46.2*24/179).


Another question:

How many boxes should we use on P-1 so as not to be caught by the main project? (Finishing at least the same sized n range per day?)

Currently, the main project finishes ~300 tests per day. Assuming a pc can finish a P-1 test at n=4m in 6 hours, that would suggest we should use 75 PCs 7/24 on P-1 not to be caught (=300/(24/6)).


Am I wrong somewhere? Or is it still not (or hardly) feasible to use P-1 at n=4m level? (given the current relative client speeds and sieve depth)

Joe O
06-17-2003, 06:53 PM
I've only finished 1 P-1 k n pair with version 08 in the 4.075M range. It took just 4 minutes short of 14 hours to complete. This is on a PIII/500 running Win98SE.

Can someone tell me how long a 4.075M PRP would take on this machine?

I've just finished the 2nd P-1 k n pair with version 08 in the 4.075M range. It took just 3 minutes short of 13 hours to complete. This is on a PIII/500 running Win98SE.

I've just finished the 3rd P-1 k n pair with version 08 in the 4.075M range. It took just 3 minutes short of 13 hours to complete. This is on a PIII/500 running Win98SE.

I've just finished the 4th P-1 k n pair with version 08 in the 4.075M range. It took just 7 minutes short of 13 hours to complete. This is on a PIII/500 running Win98SE.

garo
06-17-2003, 08:05 PM
Yes, I'm getting similarly long times. I got 7 hours to finish a test on a P4-2533. Given that the chance of finding a factor is 3%, this implies 7*33/24 = 9 days per factor found, with each factor saving two PRP tests - one of them a double check. Still, that seems pretty high.

I think the bounds calculator is a bit screwy and is giving bounds that are too large. I experimented with -x 50 and the bounds and the time to finish seem much more reasonable.

Any comments Louie?

cjohnsto
06-17-2003, 09:14 PM
start /LOW /b /wait ..\ecm -c 500 250000 < in.txt >> out.txt

Does anybody understand what he means? [/B]

OK, I wrote that quickly but i thought it was not too bad.

start tells windows to run a command.
/LOW tells it to run at low priority. /b and /wait make it use the current window and wait for the command to finish before continuing.

Then you type the command you want to run after that, with any options it needs.

Why do this?

It allows the program to start in whatever priority you want without having to use taskmanager to change it. So put the line in a batch file and you can use it to start the program when windows starts without worrying about changing priority manually.
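For the factorer, such a line would presumably look like this (untested, using the v0.8 command line from Louie's earlier post):

start /LOW /b /wait sbfactor.exe SoB.dat lowresults.txt results.txt 4080000 4085000 -x 45 0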

jjjjL
06-18-2003, 05:38 PM
:) SB Factor v0.9 :)

improvements include:
-runs at idle priority
-parses lowresults.txt/results.txt much quicker
-allows parsing out larger ranges (10k # limit instead of 1k)
-now includes the lower bound inclusively

idle priority :thumbs:

faster parsing :thumbs:

the larger range, although not useful for actual work, may be useful for estimating the total number of factors in large areas, since you can scan ranges of around 200000 now. it is possible to do this because of the aforementioned faster parsing. :thumbs:

and the inclusion of the lower bound is just a technical corner case to make sure we don't miss numbers. right now, it would be possible to miss numbers in between two people's ranges where they meet. i did a quick check and there are none missed yet but this will make it so we don't have to constantly check for n values on the boundaries.

i noticed garo is messing with the factorlimit to manipulate the optimal bound setter. i think a more reasonable way to do that would be to set the double check flag. this would lower the bounds in a more predictable way. if you can, i'd like for you to go through the code i posted and recommend a way to make the program's bound checker truly optimal. the formulas are pretty easy to follow. let me know.

anyway, here's the new version:
http://www-personal.engin.umich.edu/~lhelm/sbfactor09.zip

-Louie

garo
06-18-2003, 06:14 PM
Hi Louie,
You are right -x 50 was not the right way. In fact I did not employ that technique after all and used the doublecheck flag as you have just suggested.

Turning double-check on lowered B1 by a little over half and B2 by 2/3: B1 105000 -> 45000 and B2 1735750 -> 573750.
The chance of finding a factor per exponent dropped from 3.1% to 1.9% and the time required to complete a test dropped by 70% or more - don't have exact numbers.

My experience with p-1 in GIMPS suggests that bounds depend not only on how far factored and doublecheck or not but also on machine type/speed and on memory available. for instance P4s usually give higher bounds for the same numbers than other types of computers.

I also looked at the code and the only thing I can suggest changing seems to be the constants. I think you might need to do a recalculation to get better constants for the number of squarings required etc. Remember that GIMPS factors have to be of the form 2kp+1 whereas the same does not hold for SoB. So the constants will certainly be different.

cjohnsto
06-18-2003, 08:38 PM
What are the benefits of using P-1 factoring instead of ECM?

Besides the fact that we already have P-1 implemented.

jjjjL
06-18-2003, 08:51 PM
i took out the 2 * p factor already. there's a comment where i removed it.



/* What is the average value of k in 2kp+1 s.t. you get a number */
/* between 2^how_far_factored and 2^(how_far_factored+1)? */

kz = 1.5 * pow (2.0, how_far_factored); // / 2.0 / p; removed by Louie since we have no 2*p in proth


i'm not positive what the factor of 1.5 signifies. i'm fairly certain we should still have it.

the fft length calculating code used by gimps is replaced by the prp code. so in theory, those two things are correct. the table of gcd constants should be the same since it's only based on the size of the number being factored. i just checked the gcd estimation code and it's correct.

so what we might want to do is something that wblipp suggested: reduce the double check constants a little. it's likely that we won't completely double check some numbers. instead of valuing a factor as much as 2*(error rate+prp time), which equals 2.1, perhaps it should be closer to 1.5 or so. enabling the double check factor basically does this by reducing the value of a factor to prp time + 2*error rate = 1.1. if someone has a logical reason for believing this factor should be something precise, please explain. if not, i'd recommend we make it 1.5. if it's going to be arbitrary, it should at least be simple. ;)

-Louie

Joe O
06-18-2003, 10:46 PM
Louie,
Shouldn't the doublecheck flag come from the lowresults.txt and results.txt files NumTests entry?



p k n UserID TeamID NumTests
3000032566681 55459 15805426 1 3 0

EDIT:
I think that we should keep the parameter on the command line, and call it a "relax constraint" parameter, or something like that. Allow it to range between -9 and +9, and add it to the NumTests entry for that test. This would mean different B1 and B2 for each test based on the NumTests value, and an overall bias for the run based on the "rc" parameter. If you want to run fast, lowering the chance of finding a factor by lowering the B1 and B2 values, you would make this a 1 or 2 or 3 ... or 9. If you have more time and want to raise the chance of finding a factor by raising the B1 and B2 values, you would make this a -1 or -2 or -3 ... or -9.

wblipp
06-18-2003, 11:13 PM
Originally posted by jjjjL
if someone has a logical reason for believing this factor should be something precise, please explain.
-Louie

Suppose our error rate, presently zero, becomes 1%. At this point the earliest time it makes sense to do a double check would be when the new exponents are 3.8 times the double check exponents. At this point the new tests take about 26 times as long (estimated as [x*ln(x)]^2 in the Resource Allocation thread). Primes are scarcer by a factor of 3.8 in the new exponent range because the number of primes is proportional to 1/ln(x), the same proportion as 1/n. Hence it is expected to take 26*3.8=100 times as much work to find a prime among the new exponents as it would take if the double check exponents were untested. This balances the 0.01 chance there was a prime with an error in the lower range.

I think that 3.8 is too soon when you consider the fact that primes were found for other k's, but for the time being let's ignore that.

If we adopt a "3.8 rule," then at the time a prime is found we will have double-checked from zero to "z" and tested from 0 to 3.8z. The fraction of double tests will be 1/3.8 = 26%. Hence I propose that the correct factor to account for double tests is 1.26.

Joe O
06-18-2003, 11:21 PM
Originally posted by wblipp

I think that 3.8 is too soon when you consider the fact that primes were found for other k's, but for the time being let's ignore that.

If we adopt a "3.8 rule," then at the time a prime is found we will have double-checked from zero to "z" and tested from 0 to 3.8z. The fraction of double tests will be 1/3.8 = 26%. Hence I propose that the correct factor to account for double tests is 1.26.

So use a "4.0" rule for simplicity's sake. This gives us 1/4 = 25% for the fraction of double tests, and a factor to account for double tests of 1.25. This is in the spirit of Louie's "arbitrary but simple" criterion.

wblipp
06-19-2003, 10:14 AM
On further reflection, the proposal of 1.25 may be too low. While it's true that "a rule of 4" means that when we find a prime we will have double checked 25% of the results, there is the question of when we will reach the prime. Today we are handing out first tests around n=4 million. Under a "rule of 4" these will be double checked unless we find a prime for this k-value before n=16 million. Using Proth Weights to estimate this, the probability of NOT finding a prime between 4 and 16 million varies from 61% to 87%, averaging 73%. So if we adopt a "rule of 4," about 75% of the numbers around n=4 million will be double-checked. If we go to a "rule of 10", it drops to 60% double-checking. To get to 50% double-checking, we would need to use a "rule of 25."

A "rule of 25" would be appropriate if our error rate turns out to be 6*10^(-6). A "rule of 10" would be appropriate for an error rate of 2*10^(-4). I'm still not sure how the knowledge of other k values affects this; I'm pretty sure that having found some primes makes it less likely we missed primes for the other k-values.

I guess I like "stick with 1.5 until we collect some error data."

Joe O
06-19-2003, 08:03 PM
Originally posted by wblipp
I guess I like "stick with 1.5 until we collect some error data."

Sounds good to me too!

Louie,
The P-1 program currently processes all the k n pairs for 4847 before going on to the pairs for 5359 etc. Would it be possible to sort the k n pairs by increasing n regardless of k? This would allow those people just ahead of the PRP effort to have a better chance of remaining ahead.
Thanks,

jjjjL
06-19-2003, 10:04 PM
Originally posted by Joe O
Louie,
Shouldn't the doublecheck flag come from the lowresults.txt and results.txt files NumTests entry?


can't do that since the results.txt file doesn't have unfactored numbers in it. it's just using the result files to know which numbers it shouldn't check at all. maybe i'll make the last number just equal the factor value instead of being a flag. that way you can decide how many prp tests of the number you feel a factor is worth. closer to 1 for speed, closer to 2 for more factors.

i'll see if i can reorder the tests so it steps through the ranges by n value instead of k/n value. that's a good idea.

-Louie

jjjjL
06-20-2003, 01:54 PM
i haven't done exact timings, but it may pay to skip the stage 1 gcd. it takes as long as the stage 2 gcd and has much less chance of finding a factor. it will only prevent the second stage on around 1% of the tests.
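For reference, stage 1 boils down to one big powering followed by that gcd; a toy version with a 64-bit demo modulus (the real client does the powering with its FFT squarings mod k*2^n+1, and 3 is the base the GIMPS-derived code uses):

#include <stdio.h>
#include <stdint.h>

static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t n)
{ return (uint64_t)((unsigned __int128)a * b % n); }

static uint64_t powmod(uint64_t a, uint64_t e, uint64_t n)
{
    uint64_t r = 1;
    for (; e; e >>= 1, a = mulmod(a, a, n))
        if (e & 1) r = mulmod(r, a, n);
    return r;
}

static uint64_t gcd64(uint64_t a, uint64_t b)
{ while (b) { uint64_t t = a % b; a = b; b = t; } return a; }

/* raise 3 to every prime power <= B1, then take the stage 1 gcd;
   a result strictly between 1 and N is a factor */
uint64_t pm1_stage1(uint64_t N, uint64_t B1)
{
    uint64_t a = 3;
    for (uint64_t q = 2; q <= B1; q++) {
        int isprime = 1;
        for (uint64_t d = 2; d * d <= q; d++)
            if (q % d == 0) { isprime = 0; break; }
        if (!isprime) continue;
        uint64_t qk = q;
        while (qk <= B1 / q) qk *= q;   /* highest power of q <= B1 */
        a = powmod(a, qk, N);
    }
    return gcd64(a - 1, N);
}

int main(void)
{
    /* 1391 = 13 * 107; 13-1 = 2^2*3 is 12-smooth, 107-1 = 2*53 is not */
    printf("%llu\n", (unsigned long long)pm1_stage1(1391, 12)); /* prints 13 */
    return 0;
}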

what do you think?

-Louie

ceselb
06-22-2003, 10:06 AM
Just found a factor

626134072137207677 | 28433*2^4150417+1

2^2*7*4003*8761*17351*36749+1

Also found a program to calculate prime factors
Source (ftp://ftp.rsok.com/pub/source/) and binaries for both dos (ftp://ftp.rsok.com/pub/source/dos/bfactor.exe) and linux (ftp://ftp.rsok.com/pub/source/linuxbin/bfactor)

jjjjL
06-22-2003, 02:31 PM
Originally posted by ceselb
Just found a factor

626134072137207677 | 28433*2^4150417+1

2^2*7*4003*8761*17351*36749+1

Also found a program to calculate prime factors
Source (ftp://ftp.rsok.com/pub/source/) and binaries for both dos (ftp://ftp.rsok.com/pub/source/dos/bfactor.exe) and linux (ftp://ftp.rsok.com/pub/source/linuxbin/bfactor)

nice find. :thumbs:

i factor my P-1 factors with this page http://www.alpertron.com.ar/ECM.HTM.

-Louie

ceselb
06-22-2003, 02:40 PM
Using above program and a nice little DOS utility called FU (http://users.aol.com/dosware/fu.zip) , I've compiled a list of most prime factors from the new results.txt (http://www.seventeenorbust.com/sieve/results.txt.gz) (up to 10^15, leaving the last 40 or so).

I don't have a good statistics program or the knowledge, but maybe someone can figure out more optimal bounds with this data?

The file can be found here (www.visit.se/~ceselb/SBprimefactors.zip) (770kb).

Also found another factor :D
4842232354228897 | 67607*2^4150107+1

2^5*3^2*11*257*53629*110899+1

Mystwalker
06-23-2003, 10:34 AM
Seems like the first tests of my range get assigned in approx. 5 days (and the last only ~5 hours later).

Question:
Is it wise to continue factoring k/n pairs once they have already been assigned? As it then only saves a double check, the usefulness will be less than half of what it was before...

btw. what is the best PC architecture for p-1 factoring?
P3/Athlon or P4? Does it prefer a high FSB?

jjjjL
06-23-2003, 02:35 PM
Originally posted by Mystwalker
Seems like the first tests of my range get assigned in approx. 5 days (and the last only ~5 hours later).

How did you figure that out? you may be right but I don't know, it's very hard to tell right now because of the fact that a few very high tests got assigned because of the TC incident a few weeks ago. the real upper edge of testing is much lower than 4M. I'd guess you have >5 days.


Originally posted by Mystwalker

Question:
It is wise to continue factoring k/n-pairs the time they already got assigned? As it only saves a double check then, usefulness will be less than half of what it was before...

i would say no. it's wise to pick a range that's at least near to being tested in the next month or so but it'd be equally wise to try and get one you can finish on time for maximum benefit.


Originally posted by Mystwalker

btw. what is the best PC architecture for p-1 factoring?
P3/Athlon or P4? Does it prefer a high FSB?

The factoring code uses the same squaring routines that SB does, so procs that are good for SB are good for factoring. This means fast FSBs are good and P4s are good too. The only proc that has an overwhelming advantage in factoring is the P4, so once again, i'd recommend anyone with one of those stick with SB (or at least P-1) as opposed to sieving.

-Louie

Mystwalker
06-23-2003, 03:06 PM
How did you figure that out? you may be right but I don't know, it's very hard to tell right now because of the fact that a few very high tests got assigned because of the TC incident a few weeks ago. the real upper edge of testing is much lower than 4M. I'd guess you have >5 days.

Hm, I made a mistake in my calculations... :(
Ok, I have more than 5 days left then. :elephant: But it seems like I need at least all the time I have available...

jjjjL
06-23-2003, 03:08 PM
i figured out the issue that was making the LINUX code crash.

i was testing on only P4s and the SSE2 code was unaligned.

a working linux version of SB factor is up in my personal space on the Umich servers now:

http://www-personal.engin.umich.edu/~lhelm/sbfactor.gz

-Louie

Mystwalker
06-23-2003, 03:11 PM
Ah, one request:

Could you tell me how many tests there still are to be handed out before it hits n=4073339?
So I can time the crunching to be (hopefully) ready in time...

garo
06-23-2003, 06:16 PM
I looked at the data that ceselb had posted. Unfortunately it is not very useful in figuring out optimal bounds as most factors are not P-1 smooth.

For instance, a B2 bound of 1 million would have captured 26% of the factors in the file and a bound of 2 mill would have captured 30% of the factors.

Similarly if only stage 1 was done, a bound of 50k would have captured 9% of the factors, whereas 100k would have captured 12.6% and 200k would have captured 16.3% of factors.

So, that then is the distribution of the largest factor of p-1. The second-largest (penultimate) factor distribution is much better in that with B2 = HUGE, a B1 of 25k would capture 88% of the factors, 50k -> 91.6% and 100k -> 94.6%

From this limited data, I would suggest that we keep B1 low, about 25k and keep B2 as high as possible.

But then this observation is just based on empirical data and may be way off the mark.
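(For anyone redoing this tabulation: with the min_bounds() sketch from earlier in the thread, it's just counting. Illustrative C again:)

#include <stdint.h>

void min_bounds(uint64_t p, uint64_t *B1, uint64_t *B2);  /* from the earlier sketch */

/* fraction of known factors that P-1 with the given bounds would catch;
   a factor is caught iff its minimal B1 and B2 both fit under the bounds */
double capture_rate(const uint64_t *factors, int nf, uint64_t B1, uint64_t B2)
{
    int hit = 0;
    for (int i = 0; i < nf; i++) {
        uint64_t b1, b2;
        min_bounds(factors[i], &b1, &b2);
        if (b1 <= B1 && b2 <= B2) hit++;
    }
    return nf ? (double)hit / nf : 0.0;
}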

ceselb
06-24-2003, 01:40 PM
2 more factors found.

273148210774616633431 | 5359*2^4151886+1
2*3*5*13^3*29*6599*87103*248621+1

284412835641643 | 21181*2^4151612+1
2*3*19*61*71*2953*195071+1

Am I the only one finding any factors from the coordination thread? I'm using optimal bounds, making a test take 5.5h.

Mystwalker
06-24-2003, 01:52 PM
I have completed only 2 or 3 tests so far. But 2 are almost ready.
When does the program get the factor (if there is one within the bounds)? Always at the end of the run (read: step 2 100% done) or in between?

mklasson
06-24-2003, 02:00 PM
Originally posted by Mystwalker
When does the program get the factor (if there is one within the bounds)? Always at the end of the run (read: step 2 100% done) or inbetween?
After the GCDs (at the end of stage 1 and stage 2).

As for finding factors, I got one last night:
155994863214049 | 10223*2^4081145+1
2^5*3*509*16547*192931+1

I'm also using the optimal bounds, -x 45 no double check. One test takes about 2h15m on my 2GHz athlon.

Nuri
06-24-2003, 04:05 PM
Originally posted by Mystwalker
I have completed only 2 or 3 tests so far. But 2 are almost ready.
When does the program get the factor (if there is one within the bounds)? Always at the end of the run (read: step 2 100% done) or inbetween?
It also creates a fact.txt file in the folder it is run at. Not yet for me after 7 completed tests. :(

I'm also using the optimal bounds, -x 45 no double check. One test takes about 2h15m on my 2GHz athlon.
BTW, mklasson, you mean 0 by no double check, right? If that is the case, it means that PIVs should not be used for P-1 as well, because it takes ~6 hours per test in my PIV-1700 (and garo previously wrote 7 hours to finish a test on a P4-2533).

mklasson
06-24-2003, 04:28 PM
Originally posted by Nuri
BTW, mklasson, you mean 0 by no double check, right? If that is the case, it means that PIVs should not be used for P-1 as well, because it takes ~6 hours per test in my PIV-1700 (and garo previously wrote 7 hours to finish a test on a P4-2533).
Yes, 0. But the optimal bounds calculator seems to give different B bounds for different processors. Xrillo's P4-2400 gets B1=100000, B2=1575000 assigned while my athlon gets B1=80000, B2=440000 for the same range. Despite the bigger B bounds, his P4 _seems_ to take about as long as my athlon to complete a test. We'll see once he's done with one...

Just found another factor btw :)
43473882553589 | 22699*2^4081150+1
2^2*41*257*5657*182333+1
I've found 2 factors in 8 tests so far. Not bad when the estimated total was 0.38 factors found in 18 tests. Is the estimator fishy or am I just lucky?

jjjjL
06-24-2003, 04:42 PM
Originally posted by mklasson
Is the estimator fishy or am I just lucky?

You're just incredibly lucky. For comparison, I have done 42 tests on my main system and found 0 factors. I have also run 10 more tests on a couple P4 linux machines. 0 factors found there too.

Considering each test has a probability of around 2%, I'm only slightly unlucky at this point to have not found a factor yet.

You and ceselb should go buy lotto tickets. :D

-Louie

MikeH
06-24-2003, 04:58 PM
It is interesting to look at the relative lack of factor finds from P-1 factoring to date. Active factoring at n>4M has been ongoing for more than one week, yet the bottom table in the sieving stats shows no noticeable difference between 4M<n<5M and any other similar band of n.

I guess this either shows that not much effort is being directed towards P-1 factoring, or it is not yielding factors at a worthwhile rate. Which do we think it is?

I totally agree that we need to experiment with any and all ways of removing candidates, but right now it would seem that sieving is much more efficient.

garo
06-24-2003, 06:00 PM
I think that the current bounds are suboptimal. That is probably the reason why factors are not being found fast enough. And of course, sieving is still more efficient. But P-1 gives us the opportunity to try and find factors for numbers about to be tested so that's good!

Xrillo
06-25-2003, 04:54 AM
Originally posted by mklasson
Xrillo's P4-2400 gets B1=100000, B2=1575000 assigned while my athlon gets B1=80000, B2=440000 for the same range. Despite the bigger B bounds, his P4 _seems_ to take about as long as my athlon to complete a test. We'll see once he's done with one...

Well, my computer finished a test during the night. It took 29570 seconds (roughly 8 hours and 15 minutes) at 50% cpu. No factor found. :(

For those who really want to know:
my computer is a p4 2.4GHz with 512MB 333MHz RAM. I guess the main bottleneck is the speed of my RAM...and the fact that I
bought an intel m-board not letting me run the cpu at higher speeds. Oh well, at least it is very stable...

As for the "p-1 factoring is less efficient than sieveing"-part -- I don't run the sieve anymore since I realized it was running 2-3 times faster on AMD based computers (since I run a p4). The factoring however is based on code from gimps...giving my p4 computer a slight boost over AMD based ones.

Oh, and thanks for the new PRP client. I'm giving it the remaining 50% of my cpu. :)

jjjjL
06-25-2003, 06:29 AM
Originally posted by garo
And of course, sieving is still more efficient. But P-1 gives us the opportunity to try and find factors for numbers about to be tested so that's good!

This is the crux of the "efficiency" argument. Yes, on a quantity basis, sieving beats factoring. But on a quality basis, I'd say factoring has a lot going for it.

For now, sieving still produces 1 factor every 1.2G, but the value of each factor is very difficult to determine.

As an example: How important was it to the project that I recently discovered that 10223*2^19993577+1 is divisible by 18201366731629? I have no idea. But it did take my system around 2 hours to discover just that one factor by sieving. This saved a test that would have been assigned so far in the future that no one can accurately gauge its value.

Compare that to the value of a P-1 factor that saves a test a few days away from being assigned. That we can place a value on. I know that each factor garo found will prevent a test that would tie up the average SB user's machine for about a week. And his factors are so large, that no amount of sieving would have found the same factors in time.

I'd go a step further and say that no amount of sieving would save a similar amount of tests in time. And by that, I mean a computer that sieved for a month would not find as many factors for numbers that would be assigned in the following month as a computer that did P-1 factoring for that same month.

For sake of argument, let's say that my last conjecture is wrong. Sieving really is more efficient in the scope I defined. Asymptotically, it still works out that even if my statement were wrong today (which i don't think it is), it will eventually be right after some small amount of additional sieving. That's because the factor density is falling off exponentially. So even though we currently need to sieve 1.2G for each factor, when the sieve reaches 50T it will require more like 2.2G/factor. Then when it reaches 250T, we'd have to sieve over 30G just to find one factor. Some people still don't even check out ranges that large. Even if you count on Paul to somehow double the speed of SoBSieve again, you're still at a range where sieving can't find factors much quicker than P-1 factoring.

My point is that no comparison between sieving and factoring can avoid the issue of "factor quality". If the factor doesn't help SB in anything except the very-long term, I'd say it can't be compared to a factor that helps SB right now. That means that any realistic comparison needs some threshold or some gradient that accounts for that fact that not all factors are equally important to discover right now.

A possible way to compare would be to only count numbers in the next 1M n-space. This would mean a reduction in the sieve-to-factoring ratio to 1/17 = 6%. I'd say that a 0.5M n-space window would be more realistic and a 2-3M n-space window much too large. This would mean that to realistically compare a sieve rate to a factoring rate, you'd have to divide the sieve production by something between 15-25.

Another way to look at it is to redefine what you think about the siever. The normal way to think about it is "it's a program that finds factors between n=3M and n=20M" but I think a better way to think about it is that "it's a program trying really hard to find factors around n=4M, but most of the time it fails and produces extra potentially useful factors that we don't need to know right now" or "a program that finds the factors I need right now rather slowly, but for a small performance hit, it also finds a lot of factors I don't need to know right now too". basically, it's a cruel way to express the fact that the sieve isn't finding factors in the order that they would be the most beneficial to the project. ideally, the sieve would be able to find all the factors in smaller windows of n-values and then we'd move on to the next one and so on. but it doesn't... it finds factors over a huge range of n, all spread out. we need to recognize the benefit that factoring allows us in being able to focus our effort more effectively than sieving. This has incredible value.

-Louie

Thommy3
06-25-2003, 08:20 AM
yeah, really no need for factors beyond 7M or even lower, louie... so here's my question: how much faster would normal sieving be using a sob.dat from 4-5M and not 3-20M?

of course that would mean more sieving effort, because it's worth nothing for all the numbers over 5M, but think about it. or even better: could someone test it out by making a small sob.dat?

Thommy

Mystwalker
06-25-2003, 08:37 AM
effort is sqrt(n), so this means it is sqrt(17) = 4.12 times as fast.

mklasson
06-25-2003, 08:43 AM
I just tried this with sobsieve and a custom 4M-5M sob.dat. It was about 34% faster on my 2GHz athlon.

Found a new factor overnight as well. Yay :)
31734311619887 | 21181*2^4081700+1
2*3413*63863*72797+1
Success in stage 1 saved an hour.

jjjjL
06-25-2003, 07:50 PM
i've found a couple bugs in the optimal bound checker.

for one, it was overestimating the cost of each gcd. even though it was technically wrong, it doesn't seem to dramatically affect the bounds finally chosen. in fact, it doesn't change them at all for my range.

another thing was sorta bigger. the current client uses 60MB of mem by default in stage 2 for temp space. however, the bound guesser was assuming that it only had 24MB available. this makes it drastically underweight the B2 limit. what's more, if you increase the mem used to more like 128MB, it raises the B2 limit (and the factor prob) much higher still. this is because it will be able to process each prime in the B2 stage faster with more temp space.

here's a quick comparison of P-1 on 10223*2^4150085+1

24MB (current bounder):
B1=55000 B2=288750 Success=0.017117 Squarings=135901
60MB (what the current bounder should currently return):
B1=65000 B2=601250 Success=0.021802 Squarings=159729
128MB:
B1=65000 B2=731250 Success=0.022867 Squarings=161565
256MB:
B1=70000 B2=910000 Success=0.024605 Squarings=174113
512MB:
B1=70000 B2=980000 Success=0.025033 Squarings=176561
1024MB:
B1=70000 B2=980000 Success=0.025033 Squarings=175109
2048MB:
B1=70000 B2=997500 Success=0.025134 Squarings=175734

Do I think people should use 2GB of mem? No. But the current bounder is definitely cutting short the potential of the program by being so stage 2 averse due to a perceived lack of mem.

The next version only has one memory value and it will be entered on the command line instead of assumed by the program. People can choose whatever they want but I wouldn't run it with less than 128MB.

So the next version will allow you to decide exactly how much you value a factor (not just the double check flag) and will let you pick the amount of mem you can use. it also displays the type of processor it detects.

anyone else have ideas before i compile v1?

-Louie

garo
06-25-2003, 08:21 PM
I am still convinced that the bounds estimation routine is overestimating the cost of an LL test. I wish I could be more specific but I've looked at the common.c code several times and I cannot see what is wrong.

smh
06-25-2003, 10:56 PM
I am still convinced that the bounds estimation routine is overestimating the cost of an LL test.

Might it be that it also takes a double check and a small error factor into account?

Moo_the_cow
06-25-2003, 11:22 PM
I suggest that the status of P-1 factoring be saved. What I mean
is that when I am in stage 1 23% complete, and I close the program and start it again, it starts over at 0% complete. When I do this with GIMPS, the program starts over at 22.7 % instead of 0 % like SB.

garo
06-26-2003, 02:47 AM
smh,
I think it is overestimating even after taking that into account.

As an example, my P4 2533 takes 76 minutes to do a P-1 with the doublechecking flag on i.e. when it should not take the doublecheck into account. The prob of finding a factor here is 1.9% which means the average time to find a factor is 3900 minutes. This is 65 hours. I think a P4 2533 will finish a 4M test in less than 65 hours, don't you?

jjjjL
06-26-2003, 04:38 AM
Originally posted by garo
I think it is overestimating even after taking that into account.


Looking deeper at the code, I think I agree. The bound checker is doing a direct squaring to squaring comparison. It chooses the number of squaring operations it will do (aka the bounds it picks) based on the expected # of squaring operations the prp test will take. A prp test normally takes "n" squarings where n is the exponent in the proth number. However, I included some code in the bound checker that resized n to correspond to the FFT size it uses compared to GIMPS (twice as big). But since both the squaring routines (SB and SBFactor) already use the double-sized FFT, I was basically accounting for it twice.

I'll be releasing a new version soon that fixes this so that the bound guesser is correct. Until then, at least set the double-check flag so that it only values a factor at around 3x the value of a prp test instead of 5x. This definitely explains all the "why is it taking longer to find factors than run tests?" comments a few people made recently.

The new version will have much better bound picking. Test times will drop significantly assuming people use a more modest 1.2-1.5 factor valuation. Thanks for your comments and data. I'll post the new version as soon as it's ready.

-Louie

Mystwalker
06-26-2003, 04:40 AM
What I mean is that when I am in stage 1 23% complete, and I close the program and start it again, it starts over at 0% complete.

That's strange. I don't have that problem...

jjjjL
06-26-2003, 07:09 AM
:) SBFactor v1.0 :)

http://www-personal.engin.umich.edu/~lhelm/sbfactor10.zip

New features:
-optimal bound guesser not so incredibly wrong
-"double check?" flag replaced with factor value
-can use custom amount of memory
-command line options simplified
-potential bug with ECM curve counting fixed
-better command line help
-linux version included
-run.bat now checks for command line arguments

The bound guesser was just plain off. This version should be noticeably more in line with the reality of how long prp tests actually take. It is even possibly correct now. ;)

The double check flag is removed. Now you can directly tell it how many prp tests you feel a factor is worth (instead of just getting to pick between 1.1 and 2.1). I'd say it should be over 1 and no more than 2. 1.5 is the default in the run.bat/run.sh.

The memory thing is important. I would recommend using a minimum of 128MB; 256 is better if you can. The more memory it has for stage 2, the more temp space it can use and the faster it will run. The amount of memory you allow it to use will also affect the B2 limit it chooses, since it knows to do more stage 2 testing when it becomes more efficient.

The other main thing is that the command line options have been changed. If you use run.bat/run.sh it doesn't matter, but otherwise, run the program with no command line options to get an explanation of the new format. Basically I just removed the random "-x", "ecm", and the filenames. I can't imagine anyone will miss them.

The bounds chosen are now lower so expect tests to run much faster (and have smaller success values). Overall, the tradeoff should increase factor throughput. Right now, I'm testing the linux client on a P4 2.26GHz. It can do each test in 30 minutes. At that rate and with my expected factor values, it should find a factor about every 60 hours. I don't think it could do a single prp test in that time, much less the 1.2 I'm weighing it against.

Upgrade, experiment, and let me know what you think.

http://www-personal.engin.umich.edu/~lhelm/sbfactor10.zip

-Louie

mklasson
06-26-2003, 10:11 AM
Louie,

very nice new client! Feels good to put some more memory to use.

I thought I'd reiterate some of the suggestions that have been made before:

- process the range in order of increasing n instead of k. Would make splitting ranges much easier.

- ability to print a list of all the numbers that are to be tested in a specified range.

- also, I think it would be nice to be able to specify the B1 and B2 bounds manually even when running in "range mode". This would be useful if I'd like to, say, increase B2 somewhat after a range is processed and then do a quick pass through it again.

EDIT: Oh yeah, I think there's something not quite right about the memory usage. Even though I give sbfactor 600MB, task manager only ever says it uses 450MB. I've got 1GB in the machine, so that's not the problem. At startup, sbfactor reports 600MB granted to it.

jjjjL
06-26-2003, 03:48 PM
Originally posted by mklasson
- process the range in order of increasing n instead of k. Would make splitting ranges much easier.

I know. Didn't make it into the new version. Next time.


Originally posted by mklasson
- ability to print a list of all the numbers that are to be tested in a specified range.

That wouldn't be hard to add. OK.


Originally posted by mklasson
- also, I think it would be nice to be able to specifiy the B1 and B2 bounds manually even when running in "range mode". This would be useful if I'd like to, say, increase B2 somewhat after a range is processed and then do a quick pass through it again.

That sounds kind of cool. OK.


Originally posted by mklasson
EDIT: Oh yeah, I think there's something not quite right about the memory usage. Even though I give sbfactor 600MB, task manager only ever says it uses 450MB. I've got 1GB in the machine, so that's not the problem. At startup, sbfactor reports 600MB granted to it.

I think that's normal. I didn't actually test it with more than 256MB but there's nothing saying it will use the max mem. It also has to allocate it in 3 goofy sized chunks so that's probably why.

here's how it does it:


/* Compute how many values we can allocate */

unsigned long choose_pminus1_numvals (void)
{

/* Compute the number of gwnum temporaries available */

    return ((unsigned long)
        (((double) memory * 1000000.0 -
          (double) map_fftlen_to_memused (FFTLEN, PLUS1)) /
         (double) gwnum_size (FFTLEN)));
}

void choose_pminus1_plan (
    unsigned long B,        /* Stage 1 bound */
    unsigned long C,        /* Stage 2 bound */
    unsigned long numvals)  /* Returns max number of temps */
{

/* Handle case where there is no stage 2 */

    if (C <= B) {
        D = 0;
        E = 0;
        return;
    }

/* Handle case where we are very low on the number of temporaries available */

    if (numvals < 12) {
        D = 210;
        E = 1;
        goto done;
    }

/* Try various values of D until we find the largest D that doesn't use */
/* too much memory */

    D = (unsigned long) sqrt (C-B) / 2310 + 1;
    if (D > numvals / 480) D = numvals / 480 + 1;
    D = D * 2310;
    for ( ; ; ) {

/* We guess at the best E for a given D */

        if (D <= 180) E = 2;
        else if (D <= 420) E = 4;
        else if (D <= 2310) E = 12;
        else if (D <= 6930) E = 30;
        else E = 48;

/* See if this combination of D and E will fit in memory */

        if (D * D < C + C) {
            if (D >= 2310) {
                if (D / 2310 * 480 + E + 4 <= numvals) break;
            } else if (D >= 210) {
                if (D / 210 * 48 + E + 4 <= numvals) break;
            } else {
                if (D / 30 * 8 + E + 4 <= numvals) break;
            }
        }

/* Try next smaller value of D */

        if (D > 2310) D = D - 2310;
        else if (D > 210) D = D - 210;
        else if (D > 30) D = D - 30;
        else break;
    }

/* Allocate more memory */

done:
    free (nQx);
    free (eQx);
    free (pairings);
    nQx = (gwnum *) malloc ((D>>1) * sizeof (gwnum));
    eQx = (gwnum *) malloc ((E+1) * sizeof (gwnum));
    pairings = (char *) malloc ((D + 15) >> 4);
}


-Louie

jjjjL
06-26-2003, 04:31 PM
YEH! I found a factor :D

238525666162151 | 5359*2^4104862+1

238525666162151 = 2 x 5^2 x 17 x 277 x 22037 x 45971 + 1

I just switched my range over to the new SBFactor v1.0 this morning so I'm lucky to have found this factor considering B1 was only 25000. This is probably about as unsmooth a factor as we'll ever hope to find.

-Louie

ceselb
06-26-2003, 05:00 PM
Strange, my B1 was 15000 when weighing it against 1.2 (B2=138750). I had to use 1.5 to get 25000 (B2=256250).

This is on a PIV1.5 using 256Mb.

jjjjL
06-26-2003, 05:23 PM
Originally posted by ceselb
Strange, my B1 was 15000 when weighing it against 1.2 (B2=138750). I had to use 1.5 to get 25000 (B2=256250).

This is on a PIV1.5 using 256Mb.

I am using 1.5 too (w/ 256MB), so that makes sense. Your processor has nothing to do with the bounds it picks. If you use identical factor limit, factor value, and mem on two completely different machines but on the same numbers, it should pick the same bounds, since the bound guesser doesn't actually look at what your processor is (only the internal multiplication routines look at that).

-Louie

prime95
06-26-2003, 08:42 PM
Two observations:

Sbfactorer does not seem to be based on the v23.4 GIMPS sources. Stage 2 was rewritten two months ago to use more memory if it can, use fewer multiplies, and be more efficient in low-memory situations.

There was a bug fixed where sub-optimal bounds were picked for the P4. An Athlon and a P4 should pick very close or identical bounds.

Don't expect great improvements, but every little bit helps.

jjjjL
06-26-2003, 09:37 PM
Originally posted by prime95
Two observations:

Sbfactorer does not seem to be based on the v23.4 gimps sources. Stage 2 was rewritten two months ago to use more memory if it can, use fewer multiplies, and to be more efficient in low memory situations.

There was a bug fixed where sub-optimal bounds where picked for the P4. An Athlon and P4 should pick very close or identical bounds.

Don't expect great improvements, but every little bit helps.

You're right. It is using v23.4 objects but the main code in ecm.c isn't updated. I'll update it.

-Louie

garo
06-27-2003, 04:20 AM
Finally!!
Found a factor

127404468919821091 | 19249*2^4003538+1

127404468919821090 = 2 * 3 * 5 * 19 * 67 * 71 * 227 * 2731 * 75793

ceselb
06-27-2003, 07:21 AM
This is insane, just found another factor.

1518196112941553 | 55459*2^4151566+1

2^4*11*307*69857*402223+1

This was the last test I ran before 1.0 and lower bounds. Very lucky indeed. :D

5 factors in 46 tests. :banana:

Running the new test with very low bounds to try to do as many as possible before prp catches up in a few days.

Nuri
06-27-2003, 12:21 PM
Feeling much better now. :)

114256654697557 | 10223*2^4080101+1

114256654697557 = 2 ^ 2 x 3 ^ 2 x 7 x 1097 x 17551 x 23549 + 1

garo
06-27-2003, 04:58 PM
I think we need to use 1.0 and low bounds from now on, otherwise prp will catch up pretty soon. We need to be doing 500 or so P-1 tests every day. We need more machines!!

garo
06-27-2003, 05:35 PM
As a followup, I rustled up a quick spreadsheet to estimate the rate of finding factors for given "factor values" - the factor value is the argument to sbfactor that says how valuable a factor is. Here are the results, with columns: factor value, B1, B2, chance of finding a factor, number of squarings, time on a P4 2533 with 300 MB memory in seconds, and average factor yield per day.

fval  B1     B2      P(factor)  squarings  time (s)  factors/day
1.0   10000  82500   0.005837   22439       812.13   0.620977913
1.1   10000  87500   0.005940   22878       828.02   0.61980967
1.2   15000  138750  0.008419   33983      1229.94   0.591410121
1.3   15000  150000  0.008605   34935      1264.40   0.588003736
1.4   20000  200000  0.010668   45819      1658.32   0.555811284
1.5   20000  215000  0.010881   47059      1703.20   0.551970763
1.6   25000  262500  0.012631   57662      2086.96   0.522923298
1.7   30000  322500  0.014358   69238      2505.93   0.495038975
1.8   30000  345000  0.014613   71034      2570.93   0.49109224
1.9   35000  385000  0.015929   80947      2929.71   0.469761816
2.0   35000  420000  0.016285   83718      3030.00   0.464364356
2.1   40000  490000  0.017761   95948      3472.64   0.4418974


So in case we find that prp tests are outrunning us, i.e. P-1 is underpowered, it may make sense to bring the factor value down. I'm running my current batch of tests with fval=2.0, but it may be better to lower it to 1.5 - if we end up being underpowered.

garo
06-27-2003, 07:40 PM
Some more Excel work. I tried to compute the average time it will take to complete a test given the preceding probabilities and times to test. Of course, sbfactor is now picking optimal bounds, and given that two tests are saved, it seems that factor value = 2 works best. Nevertheless, I tabulated the time to completion with P-1 factoring for various values of the time it takes to do a prp test when a factor is not found. I do not really consider single or doublechecking here; all I am concerned with is the amount of computation time a factor saves.

Columns: factor value, B1, B2, then the average number of hours to clear a number, assuming a prp test takes 40, 50, 100, and 150 hours respectively.

fval  B1     B2      40h prp      50h prp      100h prp     150h prp
1     10000  82500   39.99211167  49.93374167  99.64189167  149.3500417
1.1   10000  87500   39.99240556  49.93300556  99.63600556  149.3390056
1.2   15000  138750  40.00489     49.9207      99.49975     149.0788
1.3   15000  150000  40.00702222  49.92097222  99.49072222  149.0604722
1.4   20000  200000  40.03392444  49.92724444  99.39384444  148.8604444
1.5   20000  215000  40.03787111  49.92906111  99.38501111  148.8409611
1.6   25000  262500  40.07447111  49.94816111  99.31661111  148.6850611
1.7   30000  322500  40.12177167  49.97819167  99.26029167  148.5423917
1.8   30000  345000  40.12962722  49.98349722  99.25284722  148.5221972
1.9   35000  385000  40.17664833  50.01735833  99.22090833  148.4244583
2     35000  420000  40.19026667  50.02741667  99.21316667  148.3989167
2.1   40000  490000  40.25418222  50.07657222  99.18852222  148.3004722


If anyone could tell me how long a P4 2533 takes to do a test, it would make these results a bit more concrete.

Also, these results show that in the long run, i.e. taking doublechecking into account, it makes sense to set the factor value to 2. But if prp testing threatens to overtake p-1 then lowering the factor value should not have too detrimental an effect on the overall throughput of the project.
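A minimal sketch of the arithmetic apparently behind the table above (the interpretation of the columns is mine, checked against the first row): the average time to clear a number is the P-1 time plus, when no factor is found, the prp time.

#include <stdio.h>

int main(void)
{
    /* reproduces the first row of the table: p = 0.005837,
       812.13 s of P-1, prp test assumed to take 40 hours */
    double p1_seconds = 812.13;   /* time spent on the P-1 attempt  */
    double p_factor   = 0.005837; /* chance the P-1 finds a factor  */
    double prp_hours  = 40.0;     /* assumed time for one prp test  */

    double expected = p1_seconds / 3600.0 + (1.0 - p_factor) * prp_hours;
    printf("expected hours to clear: %.8f\n", expected);  /* ~39.99211167 */
    return 0;
}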

prime95
06-27-2003, 09:21 PM
Originally posted by jjjjL
It is using v23.4 objects but the main code in ecm.c isn't updated. I'll update it.

The optimal bounds picker changed too.

I remembered what the P4 optimal bounds bug was - the routine addr_offset in gwnum.c used the uninitialized global variable FFTLEN rather than the local variable fftlen.
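A minimal illustrative sketch of that class of bug (hypothetical code, not the actual gwnum.c):

#include <stdio.h>

unsigned long FFTLEN = 0;   /* global, possibly not yet set when called */

unsigned long addr_offset_buggy(unsigned long i)
{
    return i % FFTLEN;      /* BUG: reads the global FFTLEN ... */
}

unsigned long addr_offset_fixed(unsigned long fftlen, unsigned long i)
{
    return i % fftlen;      /* ... instead of the local fftlen argument */
}

int main(void)
{
    /* the fixed version uses the value the caller actually intends */
    printf("%lu\n", addr_offset_fixed(4096, 5000));  /* prints 904 */
    return 0;
}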

garo
06-29-2003, 07:39 PM
I have switched to a 1.2 factor value for my current range as proth tests are catching up with the P-1 effort pretty fast.

Mystwalker
06-30-2003, 05:46 AM
Got an Access Violation (0xc0000005) error message while trying to start the factorer on a P4 machine. :cry: The program crashes right after the CPU detection.

I'm pretty sure the address won't get you any advantage, but I post it anyway ;): 41f7b4

garo
06-30-2003, 05:53 PM
488399926097359 | 4847*2^4026567+1
1863592358663911 | 10223*2^4029221+1
51314675885537 | 19249*2^4026722+1


Hi Louie,
I forgot to log on as I submitted these three factors today. I resubmitted after logging on, but just wanted to let you know so the stats show up eventually.
Thanks.

Mystwalker
06-30-2003, 07:22 PM
While we are at factor assignments:

I forgot to login prior to submitting today, too.
Luckily, it's only one factor - something around 27.837T

Yes, < 28 points, but I need every one I can get to eventually beat priwo... ;)

jjjjL
06-30-2003, 09:09 PM
Originally posted by Mystwalker
Got an Access Violation (0xc0000005) error message while trying to start the factorer on a P4 machine. :cry: The program crashes right after the CPU detection.

I'm pretty sure the address won't get you any advantage, but I post it anyway ;): 41f7b4

Linux I assume?

Mystwalker
07-01-2003, 04:56 AM
Nope, it's the windows version.

Is there a Linux version of the factorer available after all?

jjjjL
07-01-2003, 05:52 AM
Originally posted by Mystwalker
Nope, it's the windows version.

Is there a Linux version of the factorer available after all?

Must be poorly aligned code. That's no good. The linux code had to be realigned to make it not crash on P4s. I can't say I tested it on windows P4 systems, but then again, I've heard others say that it worked, so I thought it was fine.

The linux version of SBFactor is in the same zip file as the windows version. The "sbfactor" and "run.sh" files should do the trick. I'm running that version on a few P4s right now.

-Louie

Mystwalker
07-01-2003, 09:30 AM
The linux version of SBFactor is in the same zip file as the windows version. the "sbfactor" and "run.sh" files should do the trick.

Ah, I remember. Ok.

btw. on another P4 machine, it _does_ run. :confused:

jimmy
07-01-2003, 11:15 AM
I have just found
148603027109277394117 | 67607*2^4090091+1
with sbfactor but the submission script doesn't accept it (because p>2^64 ?). What should I do?

Mystwalker
07-01-2003, 11:30 AM
Try http://www.seventeenorbust.com/largesieve/

mklasson
07-01-2003, 11:40 AM
what a monster :) congratulations

jjjjL
07-02-2003, 05:37 AM
:) SBFactor v1.1 :)

http://www-personal.engin.umich.edu/~lhelm/sbfactor11.zip

New Features:
-re-sorts tests by n instead of by k and then n
-prints k/n values before testing
-allows intelligent range splitting
-run.sh now uses the same factor value (1.5)
-newer results.txt file

the first two are self-explanatory. the re-sorting by n makes sense so you can more accurately guess where you are and break up ranges after starting them. the printing of k/n values was asked for by mklasson.

the range splitting is a new feature i added for my own purposes but i made it a documented feature since others may want it. when doing a range of n values, you can just add two values before the amount of ram to signify the computer # and the total number of computers splitting the range.

so to split a process like

./sbfactor 4200000 4205000 45 1.5 256

across 3 computers, you'd use the commands

./sbfactor 4200000 4205000 45 1.5 1 3 256
./sbfactor 4200000 4205000 45 1.5 2 3 256
./sbfactor 4200000 4205000 45 1.5 3 3 256

the reason i say the splitting is "intelligent" is that it doesn't just break the range into 3 even chunks of width range/3. instead, it starts each computer at a different offset in the list of all values to test and steps by the total number of computers. this means that not only is the range processed from lowest to highest across all computers, but each computer also gets as close to an even amount of work as possible, even if the values aren't evenly distributed by n.
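A minimal sketch of that interleaving (made-up n values, not sbfactor's actual candidate list):

#include <stdio.h>

int main(void)
{
    /* candidates are unevenly distributed by n, as in practice */
    int n_values[] = { 4200003, 4200017, 4200101, 4200250,
                       4201999, 4203500, 4204001, 4204998 };
    int count = sizeof n_values / sizeof n_values[0];
    int total_cpus = 3;

    for (int cpu = 1; cpu <= total_cpus; cpu++) {
        printf("cpu %d tests:", cpu);
        /* start at this cpu's offset, then step by the number of cpus */
        for (int i = cpu - 1; i < count; i += total_cpus)
            printf(" %d", n_values[i]);
        printf("\n");
    }
    return 0;
}

Each computer works from lowest to highest n, and a clump of candidates in one part of the range gets spread across all the machines instead of landing on one.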

right now, I'm using the splitting feature to run sbfactor on multiple computers remotely with my little awk script i created:

hostinfo -ALL -LINUX | awk '/0.00/ {i=1;while(match(substr($1,i++,1),"[a-z0-9]"));tmp = "ssh -f " substr($1,1,i-2) " ./sbfactor 4041200 4045000 45 1.5 " 1+comps++ " xXx 256" tmp} END {gsub(/xXx/, comps "\n", tmp);print tmp}' | bash

i don't think many people can use that but hopefully the new version helps those who are doing a lot of factoring. if you're happy with v1.0, there's no real need to upgrade. the next version will have manual bound setting for ranges and will use improved P-1 code.

-Louie

mklasson
07-02-2003, 11:27 AM
Very nice, Louie.

Just to make sure: the mod operation does use something clever like Gallot's idea, right? I'm assuming it does, but better safe than sorry. :)

Oh, and I found a really, really smooth factor last night:
40315798264717 | 21181*2^4031084+1
2^2*3^3*13*211*367*601*617+1
If only we could use B1=601,B2=617 all the time... :)

garo
07-02-2003, 12:42 PM
Hi Mike and Louie,
The factors I submitted without logging in have not been credited yet. Any ETA on this?
thanks

Keroberts1
07-03-2003, 02:23 AM
Question:

If a factor is found by sieving, is the candidate still checked by P-1 factoring if the factor was found after the dat file for P-1 was made?

I ask because i just found
40112001560353 | 10223*2^4322789+1

to be tested very soon

mklasson
07-03-2003, 08:26 AM
Another question for you Louie:

don't factors > 2^64 get included in results.txt?

MikeH
07-03-2003, 12:17 PM
Hi Mike and Louie,
The factors I submitted without logging in have not been credited yet. Any ETA on this?
thanks

In the new scoring I'll pick up the P-1 co-ordination thread so that I can identify any factors that were submitted without being logged in. This has always worked well for sieving, and looking at the typical size of the P-1 factors, there shouldn't be any confusion. Sorry all, I haven't had a lot of time recently to look at the scoring, but I should have some free time soon.

ceselb
07-03-2003, 03:23 PM
Shortly after I reserved 4055000-4060000, I noticed that the main effort has already caught up to that number.
I'm guessing there's not much use in doing that range now, or will the server stop a test while it is in progress (if partial blocks are reported)?

larsivi
07-03-2003, 04:00 PM
As far as I can see, it's only the 67607 range that has a correct "max n tested" value at the moment. The others have had values above 4 million several tests before I got my first test above 4 million. I'm not sure how many tests there are below 5 million, but I guess that there are a few (1100 above 4 million are pending at the moment).

My very uneducated guess is that most tests out there are still below 4010000-4020000.

garo
07-03-2003, 04:05 PM
I've been watching the progress for some days now and the "real" assigned max usually goes up by 8-10000 a day. With that in mind I think a safe guess is 4020000.

So do your range :)

Troodon
07-06-2003, 09:21 AM
I want to help with P-1 factoring, but I'm a bit lost. I've downloaded the latest version of the factorer, and my computer has 128 MB of physical RAM. After choosing a range, what parameters should I enter?

ceselb
07-06-2003, 10:37 AM
I'd suggest something like
sbfactor #start #end 45 1.5 128

If you don't mind the lag, you might even try 256 at the end.

45 is the sieve depth (2^45 ~= 35T).
1.5 is how many prp tests you think a factor is worth. Most people use 1.5 afaik.
128 is the memory used (in the B2 stage, I think).

Troodon
07-06-2003, 11:56 AM
I'll try, thanks! Does it save the progress?

MikeH
07-06-2003, 01:31 PM
It's good to see that the effects of the P-1 factoring are finally making their mark in the stats.

Over the last two weeks the 4M<n<5M band has averaged 15 candidates removed per day. Compare that with the other 1M bands at 10-13, and we see that P-1 is generating a useful 2-3 factors per day in the very useful area.
:cheers:

Moo_the_cow
07-06-2003, 02:46 PM
Since this is an active thread, IMHO it should be "stickied" along with the P-1 coordination thread.

ceselb
07-06-2003, 04:21 PM
Threads are now sticky. I hesitated to do it myself, since Louie unstuck the double checking coordination thread a while back.

Maybe we could even split the forum into two parts.

Troodon
07-06-2003, 04:45 PM
My computer is doing a range. What sequence should I follow to power it off without data loss? Thanks!

ceselb
07-06-2003, 04:58 PM
No real need to do anything special, it saves the progress every few minutes.
If you want to minimize the work lost, wait until the xxxxxxxxxx file updates and then ctrl-c.

Troodon
07-06-2003, 05:04 PM
And when I want to restart it? Should I enter all parameters again? Thanks!

ceselb
07-06-2003, 05:43 PM
Yes.
But you could create a bat file and place it in the startup folder. I don't do this myself, but it's mentioned in this thread one or two pages back iirc.

Troodon
07-07-2003, 01:30 PM
I've entered
sbfactor 4154000 4300000 45 1.5 128
and it has factored up to 27653*2^4154145+1. Now I want to change some parameters: the end of the range (to 4155000), the value (2.0) and memory (256). Could I enter
sbfactor 4154146 4155000 45 2.0 256
or will there be any problem? Thanks!

garo
07-07-2003, 02:21 PM
Troodon,
Yes, it should be fine. However, since you are changing the factor value parameter to 2.0, the bounds will increase and some work will need to be redone for the numbers you have already done. It may be a better idea to stick with 1.5 for this range.

mklasson
07-07-2003, 02:41 PM
Originally posted by garo
Troodon,
Yes it should be fine. However, since you are changing the factor value parameter to 2.0 the bounds will increase and some work will need to be redone for the numbers you have already done.
That shouldn't be a problem, as he raised start_n as well. It might be better to use start_n = 4154145 though -- some other k could be untested for that n. With that start_n, some additional work would be done for 27653*2^4154145+1.

Keroberts1
07-07-2003, 11:01 PM
is the number of factors estimated in the coordination thread a dependable number? because there definitely seem to be more getting found. It appears we are finding many more than the estimates say. Is P-1 more effective than we previously thought?

jjjjL
07-08-2003, 01:28 AM
the estimated # of factors assumes that the number has been completely pre-sieved to 2^45. this would cause some underestimating since it's not taking into account the probability of finding factors in the unsieved ranges below 2^45.

is the difference between estimates and actual statistically significant?

-Louie

Keroberts1
07-08-2003, 02:27 AM
only if you're trying to figure out the value of P-1 factoring and weighing where you should devote your resources

which brings up the issue of a resource allocation model. I know this has been tried before, but could P-1 be introduced? just a thought. i doubt it would have much of an impact, but i was wondering how the resource allocation affects the timeline for the discovery of the last prime, since that is truly the only discovery that makes this a complete success. I'm referring to the original resource allocation model created earlier, but if a new one could be constructed to be more accurate, that would be great too. I don't expect anyone to build a completely new model, though, because obviously that would be a lot of work.

Troodon
07-08-2003, 11:58 AM
garo, mklasson

I've decided to redo the whole range, thanks!

Mystwalker
07-08-2003, 01:01 PM
If you still have the old save files, it takes less time to compute with the new bounds.

frmky
07-09-2003, 03:47 AM
Yeah! My first factor! 1918126591518655393 | 4847*2^4155831+1 where 1918126591518655393=2^5*3*7*137*191*541*7013*28751+1. I'll submit it to the database tomorrow.

Is anyone keeping the intermediate files? They seem useful only if someone wishes to later try a larger B1 or B2.

Greg

Mystwalker
07-09-2003, 03:56 PM
As it seems we're factoring faster than Proth testing right now (at least it seems that way to me), maybe we should set the bounds to a higher value?

Can someone estimate what bound factor would be best?

btw. version 1.1 did work on the P4 system that crashed with the old version. :thumbs:

prime95
07-11-2003, 03:56 PM
Originally posted by Troodon
I want to help with P-1 factoring, but I'm a bit lost. I've downloaded the latest version of the factorer, and my computer has 128 MB of physical RAM. After choosing a range, what parameters should I enter?

If you have 128MB of physical memory, then you shouldn't tell the program to use more than 96MB or so (this assumes the OS and various programs need 32MB). If you tell the program to use too much memory, you will thrash in stage 2 and run very slowly.

jjjjL
07-14-2003, 02:53 PM
i have compiled a couple of new copies of SBFactor.

one uses the newer GIMPS code and should be faster. However, tests seem to indicate that it is slower (??). i'll post it in a zip file for brave souls to try out. i wouldn't use it for regular testing until it's a bit more polished, but if you want to speed test it you can have it. i also didn't strip all the normal GIMPS functionality out of it, so it reads the standard ini files that the Prime95 client uses to set its memory (Day and Night) along with other settings. it also writes to factors.txt, and it writes messages to the log even when no factors are found, which i know someone wanted. anyway, it's called ecm.exe in the zip, and use local.ini to set memory usage (not on the command line anymore).

the other new version (sbfactor.exe) has a special feature that lets you overwrite the internal sieve data. this lets you do quirky things like quickly check that factors will be found by using only the factors you know you need. for instance, all i do is write a file called 'badsieve.txt' and put this in it:

2
7
23
29
53
181
2699
6473

then i run sbfactor with the command

sbfactor 55459 4132918 7000 8000 256
SBFactor v1.1dev
P-1 and ECM factoring for number of the form k*2^n+1.
Adapted from GIMPS v23.4 by George Woltman and Louis Helm
AMD Athlon(tm) processor detected.
256MB of memory avilable for stage 2
Reading false sieve data from badsieve.txt
P-1 on 55459*2^4132918+1 with B1=7000, B2=8000
initializing test
sieve finished
55459*2^4132918+1 stage 1 complete. 170 transforms. Time: 32 seconds
Starting stage 1 GCD - please be patient.
1565008878285119 | 55459*2^4132918+1
Total factoring Time: 3 minutes

And that was while I was running two other background processes so assuming you know all the factors of P-1, you can get a factor a minute. ;)

in reality this is probably only useful for testing and doing absurd things like "finding" 40-digit factors in under a minute.

the assumption with the B1/B2 levels is that the lower the prime, the more likely it is to be a factor of P-1. that's why you'd never want to add primes to the factorization in any order but lowest to highest. that's also why B2 starts at the place where B1 leaves off. i agree that this is normally a safe assumption, but if you know better, feel free to prove us inside-the-box thinkers wrong.
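For readers unfamiliar with the stage 1 / stage 2 split being discussed, here is a naive small-number sketch (illustrative only -- the real code works on huge gwnums with FFT arithmetic and a much cleverer stage 2): stage 1 raises a base to every prime power up to B1; stage 2 then tries each prime in (B1, B2] on top of the stage 1 residue.

#include <stdio.h>

static unsigned long long modpow(unsigned long long b, unsigned long long e,
                                 unsigned long long m)
{
    unsigned long long r = 1;
    b %= m;
    while (e) {
        if (e & 1) r = r * b % m;
        b = b * b % m;
        e >>= 1;
    }
    return r;
}

static unsigned long long gcd(unsigned long long a, unsigned long long b)
{
    while (b) { unsigned long long t = a % b; a = b; b = t; }
    return a;
}

int main(void)
{
    /* N = 59 * 107; 59 - 1 = 2 * 29, so stage 1 with B1 = 11 misses it,
       but stage 2 with B2 >= 29 picks it up. */
    unsigned long long N = 59ULL * 107;
    unsigned long long E = 8 * 9 * 5 * 7 * 11;  /* prime powers <= B1 = 11 */
    unsigned long long x = modpow(3, E, N);     /* stage 1 residue */

    printf("stage 1 gcd: %llu\n", gcd(x - 1, N));      /* 1: nothing found */

    unsigned long long q[] = { 13, 17, 19, 23, 29 };   /* primes in (B1,B2] */
    for (int i = 0; i < 5; i++) {
        unsigned long long g = gcd(modpow(x, q[i], N) - 1, N);
        if (g > 1)
            printf("stage 2 prime %llu finds %llu\n", q[i], g); /* 29 -> 59 */
    }
    return 0;
}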

if someone found a way to prove that certain factors could not be part of P-1, or that certain factor sets were mutually exclusive, it might be possible to improve efficiency. btw, sbfactor.exe is still "old code" unlike ecm.exe, which is "new code". use sbfactor.exe to test out the badsieve features and ecm.exe to test out the new code's speed, extra ini features (look at the GIMPS ini files for ideas), and the new bound setter.

feel free to post whatever you discover using these, but at the same time, caveat emptor. i know for a fact these aren't well documented, easy to use, or feature rich. don't bother requesting features, pointing out shortcomings, etc. for these versions... use v1.1 if you want something stable. these dev versions are for testing only. anyway, i'm going to be out of town for a week and couldn't update them even if i wanted to. good luck!

-Louie

http://www-personal.engin.umich.edu/~lhelm/sbfactor11-15dev.zip

Samidoost
07-20-2003, 10:20 AM
Hi

I am using sbfactor for prothsearch. It is really marvelous.
Many thanks to everyone involved in its development.

But I have encountered a small problem with B1.

For detecting the prime
27711198115583 = 2 * 47^2 *229 * 619 *44249 + 1
which is a factor of 15*2^270458+1,
B1=619 must be enough.
But in practice it does not work with any limit less than 2251.

Regards,
-Payam

Samidoost
07-21-2003, 07:44 AM
Contrary to my previous post
which was essentially a bug report,
here is some bonus:

B1=10000
B2=100000
539336741813170023987503 | 15*2^4000+1
p-1 = 2 * 7 * 17 * 17599 * 42406229 * 3036444899

B1=100000
B2=1000000
1259443246573 | 15*2^2035+1
p-1 = 2^2 * 3^2 * 7 * 29 * 53 * 3251653

B1=100000
B2=10000000
163718110671719644337719 | 15*2^374+1
p-1 = 2 * 3 * 191 * 142860480516334768183

Note that the last factor(s!) are greater than B2.

An explanation for both bug and bonus is needed.
:tempted:
- Payam

mklasson
07-21-2003, 08:21 AM
Samidoost:
539336741813170023987503 = 836228323141 * 644963494883
836228323141 = 2^2*3*5*7*23*8443*10253+1
644963494883 = 2*7*1097*1811*23189+1
so both factors are found at the same time with your bounds.
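A tiny numeric sketch of this effect (illustrative small numbers, not the actual bounds): when two prime factors of N both have B1-smooth p-1, the stage 1 gcd returns their product, i.e. a composite "factor".

#include <stdio.h>

static unsigned long long modpow(unsigned long long b, unsigned long long e,
                                 unsigned long long m)
{
    unsigned long long r = 1;
    b %= m;
    while (e) {
        if (e & 1) r = r * b % m;
        b = b * b % m;
        e >>= 1;
    }
    return r;
}

static unsigned long long gcd(unsigned long long a, unsigned long long b)
{
    while (b) { unsigned long long t = a % b; a = b; b = t; }
    return a;
}

int main(void)
{
    /* N = 23 * 67 * 59; both 23 - 1 = 2*11 and 67 - 1 = 2*3*11 are
       11-smooth, so stage 1 with B1 = 11 reveals them together. */
    unsigned long long N = 23ULL * 67 * 59;
    unsigned long long E = 8 * 9 * 5 * 7 * 11;   /* prime powers <= 11 */
    unsigned long long g = gcd(modpow(3, E, N) - 1, N);
    printf("gcd = %llu (= 23 * 67, composite)\n", g);   /* prints 1541 */
    return 0;
}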

With that said, factors of p-1 that are bigger than the bounds can apparently sometimes be found when using "Brent-Suyama's extension" (which I don't really know anything about...). That's probably what happened in your second case. Your third factor is composite again, but with one of the constituent p-1 values having a factor bigger than the bounds.

EDIT: oh, btw, regarding your earlier post about the B1 bound needing to be unnecessarily high: I figure it has to do with the fact that your p-1 contains two factors of 47. You'll need a B1 bound >= 47^2 = 2209 then.
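A minimal sketch of why the square matters (illustrative, not the sbfactor source): stage 1 typically includes each prime q <= B1 raised to the largest power q^e <= B1, so 47 enters the exponent squared only once B1 reaches 47^2.

#include <stdio.h>

/* the multiplicity stage 1 gives a prime q: largest q^e <= B1 */
static unsigned long power_included(unsigned long q, unsigned long B1)
{
    unsigned long p = q;
    while (p <= B1 / q) p *= q;
    return p;
}

int main(void)
{
    /* stage 1 would do this for every prime q <= B1 */
    printf("B1 =  619: 47 enters the exponent as %lu\n",
           power_included(47, 619));    /* 47: 47^2 not covered */
    printf("B1 = 2209: 47 enters the exponent as %lu\n",
           power_included(47, 2209));   /* 2209: 47^2 covered */
    return 0;
}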

Mikael

smh
07-21-2003, 08:25 AM
An explanation for both bug and bonus is needed.

I can't tell anything about the bug, but the bonus is probably due to the Brent-Suyama extension, aka Suyama's powers.

There is a post from Alexander Kruppa about this on the Mersenne mailing list (http://www.mail-archive.com/mersenne@base.com/msg07584.html).

Sander

Samidoost
07-22-2003, 02:56 AM
mklasson, smh
Thank you for your useful comments.

You have almost fixed both of the problems.


regarding your earlier post about the B1 bound needing to be unnecessarily high: I figure it has to do with the fact that your p-1 has two 47 factors. You'll need a B1 bound >= 47^2 = 2209 then.
There are still primes 2213, 2221, 2237, 2239, 2243, 2251 between 47^2 and my practical B1 limit 2251.
It appears that B1=48*47 is a better choice.

- Payam

:|ot|: Just compare B1=48*47 with k=4847:D

Troodon
07-22-2003, 05:29 PM
Two questions that have been discussed but are still unclear to me:

- What is currently the best factor value? 1.5? I'm using 2.0, which takes almost twice the computing time of 1.5.

- I have tried several mem values over 256 MB (300, 310, 330 MB), but it always allocates ~252000 kB for stage 2. I think this is because there is some kind of "chunking". Couldn't this be improved?

Thanks!

biwema
07-23-2003, 08:32 AM
A factor value of 1.5 seems quite good, even if I don't have much statistical material to analyse.

At the moment it seems as if primality testing is approaching P-1 factoring. There is a huge gap of not even reserved candidates above 4188000, which will be reached by the mainstream testing in a few days.
Would it not be better to decrease the factor value a bit, to prevent leaving un-P-1ed candidates?

biwema

mklasson
07-23-2003, 01:52 PM
Yeah, it's probably better to lower the factor value a bit for now. We need more p-1 firepower! Desperately.

jjjjL
07-23-2003, 06:51 PM
yes, you will likely need to add more power to P-1. I had a lot of P4s working on it for a while, but I won't be able to use them for about a month, so someone else will have to replace the 50GHz+ I was putting on it.

Everyone will have to drop to a factor level of 1 or add about 20 more computers at 1.5 to keep up.

also, if you want to see where you should be, the next test the server is going to distribute is always in http://www.seventeenorbust.com/sieve/next.txt It is only updated a few times a day and can sometimes get (temporarily) smaller due to expiring tests, but it will give you a ballpark idea of where tests are being assigned, so you know where to be to stay ahead of prp testing.

-Louie

Keroberts1
07-23-2003, 11:15 PM
I'm trying to figure out how to use this application, but I can't get it to work. I downloaded it and installed it into the same folder as SB 1.10, but after clicking on it, it just opens for a second and then immediately closes. Is there something simple I'm forgetting to do? I'd like to help out here, because obviously you need more processors.

Joe O
07-23-2003, 11:57 PM
Originally posted by Keroberts1
I'm trying to figure out how to use this application, but I can't get it to work. I downloaded it and installed it into the same folder as SB 1.10, but after clicking on it, it just opens for a second and then immediately closes. Is there something simple I'm forgetting to do? I'd like to help out here, because obviously you need more processors.
If you create a bat file with sbfactor.exe help as the only line and execute it, you will get something similar to:


C:\#GMP-ECM50\sbfactor10>sbfactor11.exe help
SBFactor v1.1
P-1 and ECM factoring for number of the form k*2^n+1.
Adapted from GIMPS v23.4 by George Woltman and Louis Helm
Intel Pentium III or Pentium III Xeon processor detected.
Factors are printed and dumped to fact.txt
P-1 factoring a single number with set bounds:
C:\#GMP-E~1\SBFACT~3\SBFACT~2.EXE <k> <n> <B1> <B2> <mem>
P-1 factoring a single number with optimal bounds:
C:\#GMP-E~1\SBFACT~3\SBFACT~2.EXE <k> <n> <factor depth> <factor value> <mem>
P-1 factoring a range of numbers with optimal bounds:
C:\#GMP-E~1\SBFACT~3\SBFACT~2.EXE <n low> <n high> <factor depth> <factor value>
[[cpu #] [total cpus]] <mem>
ECM factoring a single number with set bounds and # of curves:
C:\#GMP-E~1\SBFACT~3\SBFACT~2.EXE <k> <n> <B1> <B2> <curves to run> <mem>

<factor depth>: how much the number is factored expressed as a power of 2
45 means the number has been factored to 2^45 = 35 trillion
<factor value>: how many prp tests a factor would be worth
values of 1.2 to 1.5 are recommended

presuming you are running in windows. If you want to reserve a range, go to the coordination thread (http://www.free-dc.org/forum/showthread.php?s=&threadid=3239) and reserve a small range, e.g. 4191000-4191100, then try run.bat 4191000 4191100 to see what happens

You should get:



C:\#GMP-ECM50\sbfactor10>sbfactor11.exe 4191000 4191100 45 1.3 256
SBFactor v1.1
P-1 and ECM factoring for number of the form k*2^n+1.
Adapted from GIMPS v23.4 by George Woltman and Louis Helm
Intel Pentium III or Pentium III Xeon processor detected.
256MB of memory avilable for stage 2
Finished parsing SoB.dat
4 numbers between 4191000 =< n < 4191100
Searching for known factors in results.txt...Done.
Searching for known factors in lowresults.txt...Done.
Removed 0 numbers using the factor files
Testing 4 numbers between 4191000 =< n < 4191100
Reordering array by n value...Done.
Estimating for k=27653 n=4191009
Estimating for k=10223 n=4191017
Estimating for k=19249 n=4191026
Estimating for k=21181 n=4191092
Expected number of factors for entire range: 0.042070
B1=20000 B2=190000 Success=0.010517 Squarings=46121
P-1 on 27653*2^4191009+1 with B1=20000, B2=190000
initializing test
sieve finished   << this really means that it started
there will be status lines after this
when it's really done it will tell you how long it took
then it will go on to the next k n pair
27653*2^4191009+1 stage 1 is 0.867 complete.
27653*2^4191009+1 stage 1 is 1.735 complete.
27653*2^4191009+1 stage 1 is 2.602 complete.
27653*2^4191009+1 stage 1 is 3.470 complete.
27653*2^4191009+1 stage 1 is 4.337 complete.

then edit your reservation to read:
4191000 4191100 keroberts 4 0.042070 ? [reserved]
when you are done running the range post a line with the ? changed to the number of factors you have found and the [reserved] changed to [completed]

There will be a file 276534191009 created after 10 minutes and updated every 10 minutes after that. This will be used to restart from if you stop the run. It can also be used to rerun with other parameters i.e. change the 1.3 to 1.5 or even run an individual pair with B1 B2 of your choosing. If you do not plan to do any more runs with this K N pair, then you may delete this file.

If you find a factor, it will be written to the end of fact.txt (i.e. appended not replaced). Submit it at the usual location. (http://www.seventeenorbust.com/sieve/)
If the factor is greater than 2^64 (18,446,744,073,709,551,616), then submit it at the large sieve page. (http://www.seventeenorbust.com/largesieve/)
If the factor is greater than 2^128 (340,282,366,920,938,463,463,374,607,431,768,211,456), then submit it to Louie via email.

Periodically, you will want to refresh your results.txt file. The latest version can be found here (http://www.seventeenorbust.com/sieve/results.txt.bz2) in BZIP2 format.

If you want to know what the next K N pair to be handed out is, look here. (http://www.seventeenorbust.com/sieve/next.txt)

Thanks to Mikael and JMBlazek for the additional ideas.

jmblazek
07-24-2003, 12:42 PM
Originally posted by Joe O
...then edit your reservation to read:
4191000 4191100 keroberts 4 0.042070 ? [reserved]
when you are done running the range post a line with the ? changed to the number of factors you have found and the [reserved] changed to [completed]

There will be a file 276534191009 created after 10 minutes and updated every 10 minutes after that. This will be used to restart from if you stop the run. It can also be used to rerun with other parameters i.e. change the 1.3 to 1.5 or even run an individual pair with B1 B2 of your choosing. [/B]

Thanks, Joe O, for the additional explanation... I needed it as well. I have a few additional questions:

- I have 20+ ########## files and more are added every couple of hours. After a full run is complete, is it ok to delete them?
- will fact.txt be overwritten or amended between runs?
- If prp testing catches up to my range, should I continue to process or stop and select a higher range?
- I'm currently processing at a 1.5 factor and 500 megs on an athlon 1800. If I increase the megs, will it go faster or should I just lower the factor?
- Do I need to be updating the results.txt, lowresults.txt, or SoB.dat files? If so, where do I get newer files?

Thanks for your help!

BTW... my ranges should read:

4182500 4183000 jmblazek 15 0.187007 1 [completed]
4186000 4188000 jmblazek 66 0.822831 ? [reserved]

mklasson
07-24-2003, 01:16 PM
jmblazek,
some answers:
>- I have 20+ ########## files and more are added every couple of hours. After a full run is complete, is it ok to delete them?

unless you're planning on doing more testing of the same numbers, but with higher bounds. You probably don't want to do this, so the answer is yes.

- will fact.txt be overwritten or amended between runs?

new factors are added to the end of fact.txt

- If prp testing catches up to my range, should I continue to process or stop and select a higher range?

if the prp test limit exceeds the n value you're testing it's probably better to skip some n until you're above the limit again. Best is probably to plan a bit ahead and lower the factor value so you can stay ahead. Also keep in mind that it's the time of your factor submission that counts. If you submit factors only after a full range is completed and the current prp limit at that time is in the middle of your n range then all factors below the limit will be worth significantly less.

- I'm currently processing at a 1.5 factor and 500 megs on an athlon 1800. If I increase the megs, will it go faster or should I just lower the factor?

It will only be very marginally faster if you give it more mem than that, and only in some cases (generally the bounds seem to get set higher when sbfactor is given more memory). Lower the factor instead, or do both if you've got memory to spare. :)

- Do I need to be updating the results.txt, lowresults.txt, or SoB.dat files? If so, where do I get newer files?

only results.txt needs updating every now and then, to skip candidates eliminated via sieving or some other method. You can find it at http://www.seventeenorbust.com/sieve/results.txt.bz2
It's bzip2 compressed.

Just my thoughts,
Mikael

jmblazek
07-24-2003, 08:27 PM
...
Just my thoughts,
Mikael
Thanks Mikael!

Does anyone know if the p-1 coordination thread will be cleaned up? Also, it'd be nice to add these links to the thread...for quick reference:

http://www.seventeenorbust.com/sieve/next.txt
http://www.seventeenorbust.com/sieve/results.txt.bz2

biwema
07-27-2003, 11:37 AM
by the way,

be careful that you don't reserve a range which is too near the prp effort.
If your factors leave the active range before you submit them, you will only score ~0.4 instead of 2100.

Or am I mistaken?

Another solution to this problem is to increase the size of the active window by 100000 or so below the prp border.

biwema

MikeH
07-27-2003, 12:02 PM
Or am I mistaken?

Another solution to this problem is to increase the size of the active window by 100000 or so below the prp border.

You are not mistaken.

I don't really want to extend the window down, because anything that is submitted below the window has not saved a PRP test, and that's where the scoring is now biased.

Mike.

Nuri
07-27-2003, 06:13 PM
It's true that these factors will score 0.4 instead of 2100. But, they will eventually score 1260 (=2100*0.6) when (if?) double check PRP reaches that number. So, you still hold your stock options. Of course, the score will not change to 1260 if we find a prime for that k before double check PRP reaches that point.

biwema
07-27-2003, 08:14 PM
When doublechecking reaches that point, first-time checking will probably be beyond 10M, scoring 12500 for each factor. In that case these 1260 points are not that important anymore compared to the factors at 10M.

Actually, I can understand Mike not lowering the first-time range, in order to motivate people to factor exponents beyond the borderline.

biwema

Frodo42
07-31-2003, 08:14 AM
From the coordination thread

The best solution would be to "merge" P-1 factoring into the main (PRP) client.

Well, stage 1 could perhaps be merged, but stage 2 is memory demanding, and therefore I don't think that would be a good idea.
I myself P-1 check candidates that have not been P-1 factored before I let the client run them, but implementing stage 2 into the client would destroy one of its main features: that it only uses "free" resources.

smh
07-31-2003, 03:46 PM
Well, stage 1 could perhaps be merged, but stage 2 is memory demanding, and therefore I don't think that would be a good idea.

This could be done the same way as in the GIMPS client. Let the user specify the amount of memory stage 2 can use. If there's not enough memory to run stage 2, just skip it and continue testing the number. The more memory is available, the higher the second bound can be.

It would be even better if daytime and nighttime settings were also implemented.

jjjjL
08-01-2003, 11:45 AM
the P-1 will eventually be in the client, but it won't be in the short term (aka this year) for a number of reasons.

first off, the P-1 code is still in flux. i'd like the time estimation code to be better. i'd like to see people test the development version I posted (with newer gw-code) that no one even tried.

second, i'd like to wait until there was actually a sizable amount of work for factoring. right now, the need for factoring is so small that i don't see why it can't be done outside the confines of the normal prp client.

third, it's ideal to only have P-1 done by those willing and able to use 256MB+ of mem. i realize that skipping stage 2 is ok when it has to be done, but stage 2 is where the majority of factors are found. if someone wants to go through the results.txt file and analyze the factors likely found with P-1, that would be interesting. but i know from when i was running a cluster of 20 P4 2.26GHz machines on it for a week that only 1 factor out of the 10 i found came from stage 1 factoring alone.

as another example of how important B2 is, look what happens if you tell the client you only have 24MB of mem for B2.

C:\sbfactor>sbfactor 4250000 4250100 45 1.0 24
SBFactor v1.1
P-1 and ECM factoring for number of the form k*2^n+1.
Adapted from GIMPS v23.4 by George Woltman and Louis
Intel(R) Pentium(R) III processor detected.
24MB of memory avilable for stage 2
Finished parsing SoB.dat
5 numbers between 4250000 =< n < 4250100
Searching for known factors in results.txt...Done.
Searching for known factors in lowresults.txt...Done.
Removed 0 numbers using the factor files
Testing 5 numbers between 4250000 =< n < 4250100
Reordering array by n value...Done.
Estimating for k=24737 n=4250023
Estimating for k=27653 n=4250049
Estimating for k=10223 n=4250057
Estimating for k=21181 n=4250060
Estimating for k=19249 n=4250066
Expected number of factors for entire range: 0.000000
B1=0 B2=0 Success=0.000000 Squarings=0
P-1 factoring doesn't make sense for this input.
B1=0 B2=0 Success=0.000000 Squarings=0
P-1 factoring doesn't make sense for this input.
B1=0 B2=0 Success=0.000000 Squarings=0
P-1 factoring doesn't make sense for this input.
B1=0 B2=0 Success=0.000000 Squarings=0
P-1 factoring doesn't make sense for this input.
B1=0 B2=0 Success=0.000000 Squarings=0
P-1 factoring doesn't make sense for this input.

The code determines that without sufficient memory, it's not even worth testing. That's how close we are to the boundary right now. Even the program itself recognizes that without B2, it's more efficient to just prp the number.

another major reason that P-1 can't be in the client immediately is that it would require a change in the protocol, the client, and the server. it may seem trivial on its face, but i assure you it's not. the server code would be the most difficult, and i'd rather wait until it was more necessary to do it.

Even if there were no justification for P-1 to be outside the client, I still wouldn't put it in right now. That's because I am personally unavailable to code for the foreseeable future:

August: In two days, I leave for California. I will then be in Florida for the second half of August. I will not touch computers after tomorrow.
September-December: Soul-crushing final semester @ UM.
Jan 2004: (Hopefully) start full-time job

That also means that I will be unable to finish the new linux client this summer. :( I'll try and finish it this fall, but other development will probably wait till '04. Speaking of which, if someone wants to manage the sieve/factor coordination threads, email me soon. Joe and ceselb seem to be busy and I will soon be gone.

-Louie

Troodon
08-01-2003, 08:43 PM
first off, the P-1 code is still in flux. i'd like the time estimation code to be better. i'd like to see people test the development version I posted (with newer gw-code) that no one even tried.


i wouldn't use it for regular testing until it's a bit more polished, but if you want to speed test it you can have it.
Currently, I have neither the time nor the CPU power to do such tests.


second, i'd like to wait until there was actually a sizable amount of work for factoring. right now, the need for factoring is so small that i don't see why it can't be done outside the confines of the normal prp client.
If the need for factoring is so small, should I stop factoring? AFAIK, I'm the only one factoring ranges ahead of PRP testing.

jjjjL
08-01-2003, 09:24 PM
Originally posted by Troodon
If the need for factoring is so small, should I stop factoring? AFAIK, I'm the only one factoring ranges ahead of PRP testing.

If factoring is producing results, I'd say you should continue if you enjoy saving SB resources. It may be a small need but it takes a lot of manual work to do it. I appreciate all you do.

Perhaps a few others will follow your lead and join in the factoring work.

-Louie

Troodon
08-02-2003, 06:25 AM
Thanks Louie! At this moment, my poor Tualatin (now running @1350 MHz because my room's ambient temperature is 40ºC!) has factored 192 k/n pairs (many of them using 2.0; now using 1.3 to gain speed) and 0 factors have been found. Maybe I'm just unlucky.

With 2.0, it needs ~145 minutes to factor a k/n pair. That equals ~8 pairs/day (I turn it off while I sleep). If we consider that ~250 PRP tests are finished per day, we would need 32 machines like mine to keep factoring ahead of PRP testing!
With 1.0, it needs ~50 minutes per pair, and we would need 12 Tualatins @1350 MHz.

Also, I've observed that the program takes memory only in very large chunks - 128 MB/chunk, I think. Even if I enter 380 MB, it only takes 256 MB.

Mystwalker
08-02-2003, 09:55 AM
I'll join factoring again once I have my new PC...

b2uc
08-06-2003, 06:24 PM
Question:

I'm P-1 factoring the range 4297000-4298000 and I found my first P-1 factor EVAR! However... the factor is not new... and it puzzles me why it wasn't excluded from the results.txt file.

3 | 21181*2^4297244+1

What's up with that?

mklasson
08-06-2003, 06:51 PM
b2uc:
3 does not divide that number. What's up with that? :)

Anyway, perhaps you have a fairly old results.txt and your factor (relatively small? somewhere around 40T?) was found by a siever just recently? I'm assuming you got the "not new"-message when trying to submit the factor.

Mikael

b2uc
08-06-2003, 07:08 PM
My computer was abruptly shut down around the time fact.txt was written... should I P-1 that number again? Just to be safe?

Also... I'm using 1.1, or whatever version was the last before the test one... and I just downloaded it yesterday... so hopefully the results.txt file is up to date.

mklasson
08-06-2003, 07:27 PM
and that line with "3 |..." was written to fact.txt? Sounds weird. I could understand if the last part of the line was chopped off, but the first part? If something was written to fact.txt, it undoubtedly seems as if a factor was found... Yeah, doing a new p-1 on that pair sounds wise.

The results.txt in the v11 zip is just over a month old now, so it might be a good idea to get a new one.
http://www.seventeenorbust.com/sieve/results.txt.bz2

Mikael

Troodon
08-08-2003, 05:24 AM
Wow! It works! After factoring more than 260 k/n pairs, I've found 38372614033613527 | 22699*2^4252294+1 although it has been found too late!
:(

MikeH
08-08-2003, 09:46 AM
although it has been found too late!

Troodon, I really feel sorry for you; it's hard enough to find these factors, then to have missed the window. :cry:

Can I suggest that P-1 factorers pick ranges that are a little further from the PRP 'wave'. In the scoring we have a 500K window; this represents about 2 months of PRP work. So I would suggest picking a range that is at least 200K from the 'wave'.

As I write this, the next.txt file indicates 4.269M. This means the coordination thread is a little confusing, because it shows ranges that are now being PRPed as [available].

Can we please update the coordination thread to offer some advice to new P-1 factorers, like "before reserving a range, check the next.txt file, add 200K to the n value, then pick an available range above that point".

Mike.

MikeH
08-16-2003, 11:09 AM
Does anyone have any metrics on exactly how much difference the memory allocated to P-1 makes? I know there's a post from Louie that shows that PCs with very small quantities of RAM just won't work, but has anyone experimented with huge quantities of RAM?

When I've run P-1, I've allocated 256M, where the PC has 512M of physical RAM. Does anyone know how much difference it would make if I upgraded to (say) 2G? If it's really small, like a 2% gain, then it's clearly a waste of time, but if it's 10%, now that's worth considering... :whistle:

Troodon
08-18-2003, 08:58 AM
Please note that when you increase the amount of RAM available to sbfactor, it also increases the optimal bounds!
Yesterday I did a little test with 4847*2^5000007+1, by entering sbfactor 5000002 5000010 45 1.0 [mem]

***********
With mem=32
***********
Success: 0.004617
B1=10000
B2=40000
Squarings=22062
Memory used (kB)=38000
Memory allocated (kB)=42000
Stage 1 transforms=28892
Stage 2 transforms=11958
Time used for stage 1 (s)=1617
Time used for stage 2 (s)=2549
Total factoring time:43 min

************
With mem=512
************
Success: 0.008419
B1=15000
B2=138750
Squarings=34402
Memory used (kB)=464708
Memory allocated (kB)=465400
Stage 1 transforms=21500
Stage 2 transforms=43254
Time used for stage 1 (s)=2420
Time used for stage 2 (s)=4077
Total factoring time:70 min


When I have some time, I'll perform more tests with the same bounds.

Troodon
08-24-2003, 03:01 PM
More benchmarks:
sbfactor 4847 5000007 100 150 [mem]

***********
With mem=32
***********
Memory used (kB)=36700 (1.1); 36704 (1.1 dev)
Memory allocated (kB)=37816 (1.1); 37856 (1.1 dev)
Stage 1 transforms=270
Stage 2 transforms=1230
Time used for stage 1 (s)=15
Time used for stage 2 (s)=158
Total factoring time=3 min

************
With mem=256
************
Memory used (kB)=67364 (1.1); 67368 (1.1 dev)
Memory allocated (kB)=68452 (1.1); 68492 (1.1 dev)
Stage 1 transforms=270
Stage 2 transforms=184
Time used for stage 1 (s)=15
Time used for stage 2 (s)=97
Total factoring time=2 min

As you can see:
- There is an improvement with more RAM, if the bounds are the same.
- sbfactor 1.1 dev is as fast as sbfactor 1.1 and takes a few more kb of memory.

MikeH
08-25-2003, 05:57 AM
************
With mem=512
************
Out of interest, how much physical RAM did this machine have?

Troodon
08-25-2003, 03:38 PM
640 Mb

Frodo42
09-04-2003, 04:54 AM
For some weird reason the factorer stops when it's done with stage 1 of 28433*2^4575913+1.
It only happens with this one, so I have skipped it so far.
Could someone perhaps check if it also happens on another box?

biwema
09-07-2003, 09:12 AM
Hi,

At the moment only a part of all the candidates are prp tested. So there is a lot of choice.

Does someone know where the limits for FFT sizes (SSE2 and non-SSE2) are? In that case, the few people who do prp tests can avoid taking ranges just a bit above these limits.
That might increase the efficiency a bit.

just an idea

biwema

biwema
09-09-2003, 07:20 PM
Hi,

I tried to find the FFT limit for SSE2 processors.

It is somewhere between 4780000 and 4781000.
At that point the B1 increases from 15000 to 20000 (v1.1; 256M RAM), increasing the chance of finding a factor by 25%, but it takes almost twice as long to calculate.

biwema

Keroberts1
09-09-2003, 08:59 PM
Does that mean that very soon the time required to do a prp test will increase drastically?

biwema
09-10-2003, 04:16 AM
I don't know exactly, but I think so.

At 4650000 one test took me 50 minutes;

at 4820000 it is 1:40, and
at 4700000 - 4780000 it is 1:20.

It seems as if there are two steps (one for the FFT size and one for the increase of B1), but I actually don't know what weight to choose (between 1 and 2) so that the throughput of factors is best. I am using 1.1.

I think I have to investigate a bit further...

we will see
biwema

ceselb
09-10-2003, 08:01 AM
I'm getting almost exactly one factor per day at 4830000 (1h 40m per test on a PIV 1.5).

Btw, has anyone tried to do a range over 5M? Sbfactor just exits if I try :( :confused:

Mystwalker
09-10-2003, 08:17 AM
I'm getting almost exactly one factor per day at 4830000 (1h 40m per test on a PIV 1.5).

Using a P4C @ 3 GHz, 2 tests at bounds=1.5 take 103 minutes. I'm only getting a factor every 2 days on average, though...


Does that mean that very soon the time required to do a prp test will increase drastically?

There will be a slowdown (a slightly dropping cEM/s rate). It has happened several times before already...


Btw, has anyone tried to do a range over 5M? Sbfactor just exits if I try

Seems like Troodon did:

5000000 5000010 Troodon 1 0.008419 0 [completed]

Frodo42
09-12-2003, 08:58 PM
I found a weird factor:
7 | 5359*2^4754206+1

I can't submit this one because of the boundaries imposed on the submit site...
Shouldn't this one have been sieved pretty much all the way?
It was found using stage 1 of P-1 factoring.

mklasson
09-12-2003, 09:07 PM
That factor is incorrect. Bug in sbfactor?

Mystwalker
09-13-2003, 07:43 AM
AFAIR we had this bug once before. It did turn out to be only the last digit of the found factor, didn't it?

Frodo42
09-13-2003, 03:14 PM
AFAIR we had this bug once before. It did turn out to be only the last digit of the found factor, didn't it?

Well I thought about that possibility. I'll let the client run through the test for that number from the beginning again when I return to the computer.

Frodo42
09-15-2003, 12:57 AM
:swear: I can't make the P-1 factorer give any output but 7 | 5359*2^4754206+1 for this test. It's probably only the last digit, but how do I get my hands on the rest of the digits? :trash:

Joe O
09-17-2003, 12:22 AM
Originally posted by Frodo42
I can't make the P-1-factorer give any output but 7 | 5359*2^4754206+1 for this test. Probably only the last digit, but how do I get my hands on the rest of the digits
You might have a prime!
I would suggest running PRP or Primo on it just to be sure.

Keroberts1
09-17-2003, 01:29 AM
is there a particular reason why that would indicate a prime?

Joe O
09-17-2003, 02:05 PM
I've been debugging programs for nearly 41 years. This is the type of "bug" that I've seen many a time: "We are only looking at composite numbers to factor; we'll never not have 2 factors." Or it could just be a very large factor, larger than the program expects to discover.
If it is indeed the last digit of the "factor", the fact that it is a 7 certainly does not rule out the possibility that it is a prime. If you recall, wblipp has shown that all of the remaining primes will end in 7. (http://www.free-dc.org/forum/showthread.php?s=&threadid=2994)

biwema
09-17-2003, 06:45 PM
Frodo42,

Did you try rediscovering your factor with different bounds?
What would happen if you used bounds that are too small to find the unknown factor?

What happens if you continue from the save file of one of those tests?

Nevertheless I wish you good luck

Frodo42
09-29-2003, 02:18 AM
Did you try rediscovering your factor with different bounds?
What would happen if you used bounds that are too small to find the unknown factor?
What happens if you continue from the save file of one of those tests?

I tried lots of different bounds and redid it in different ways.
With a weight smaller than 1.4 (I can't remember the specific bounds) it just does not find anything, and it goes on to stage 2, where it also doesn't find anything.
I also tried lower bounds and then raised them on the next run; that still gives me that blasted 7...

Anyone who can find the right factor is welcome to do so and take the credit in the stats. I've given up and am using my processor power for continued P-1 factoring instead.

Joe O
09-29-2003, 11:09 PM
Well we know it is not prime.

5359*2^4754206+1 has no small factor.
5359*2^4754206+1 is composite: [2CC06F27C7AF165D] (695868.8883s+0.0115s)

Nuri
10-14-2003, 04:21 PM
I want to run the client on my P4, but unfortunately I can not.

I don't know why, but the client exits when I try to factor a range. To be more precise, it either exits (for ranges above 4983000, if I recall correctly), or gives output like


Expected number of factors for entire range: 0.089770
B1=20000 B2=200000 Success=0.008977 Squarings=45863
P-1 on 33661*2^4495032+1 with B1=20000, B2=200000
initializing test
sieve finished
33661*2^4495032+1 stage 1 complete. 188 transforms. Time: 21 seconds
Starting stage 1 GCD - please be patient.
33661*2^4495032+1 stage 2 complete. 226 transforms. Time: 301 seconds
Starting stage 2 GCD - please be patient.
finished.
Total factoring Time: 10 minutes

for ranges below 4983000. (I also doubt whether this output is OK. As I recall, it took longer, and gave % completion info as well.)


PS:
I've tried various versions of the client,
I've read this thread a couple of times to find out what I'm doing wrong,
I was able to run it in the beginning (4080000 4081000 Nuri; the client version was 0.6, I guess).
I guess I am skipping some step somewhere, but I cannot see where. :dunno:


Any comments are welcome.