
View Full Version : Sieving



smh
12-02-2002, 06:51 PM
Louie,

How far are the current numbers sieved, and how far do you plan to go?

I have one or two machines available for sieving (they cannot connect to the SoB server), so if you need any help, just let me know.

Samidoost
12-03-2002, 12:11 PM
Hi
Here is all the information which I have about sieves:

k=33661 is sieved by Louie up to 1000G.

Almost all of the k values are sieved to approx. 80G by Phil Carmody and then to approx. 200G by Joseph Arthur.

I have attached the final result of the k=4847 sieve. It is sieved up to 96G by Phil and then up to 208G by Joseph.

You can continue it using Paul Jobling's NewPGen and then report the result to Louie to use it.

Payam:)

smh
12-03-2002, 01:17 PM
I don't mind taking it a bit further, but only after I know that I'm not duplicating any work.

Also, I have no idea how far it needs to be tested and how long it takes, but I guess we have a couple of weeks before we start testing this K above 2M.

<edit> I'll let it run overnight to see how fast things go. But I would like to know if anyone else has done more work on this K.

jjjjL
12-03-2002, 09:41 PM
If you do continue sieving, make sure to save the composites to a delete file, as that will make it easier for me to use my current removal scripts. Thanks.

-Louie

Samidoost
12-03-2002, 11:30 PM
Originally posted by smh
I don't mind taking it a bit further, but only after i know that i'm not duplicating any work.


Rest assured, this is the first time that the sieve results for this k have been published.

Payam

smh
12-04-2002, 05:17 AM
I'm only sieving N's above 2M.

I think my PIII 450 does 5 to 6 billion a day. I might have some CPU time on other machines during the night to speed things up a little. And if we are getting close to assigning this K, I can do some sieving on a P4.

I have no idea how much sieving needs to be done, but if someone is willing to help, please contact me so we can avoid duplicating work.

Hawk
12-04-2002, 11:12 PM
How large an n have we sieved to? I thought Phil "only" did up to n = 3 million, which could start causing us problems relatively soon, especially if we keep getting new people. Do we need to start a large-scale sieving effort on n > 3 million?

smh
12-05-2002, 04:11 AM
I think it will take quite a while before we run out of exponents below N=3M. Don't forget that the larger N's take much longer to test.

OTOH, it doesn't hurt to have at least some candidates available above 3M, since we have to sieve the large exponents pretty deep. Sieving 3 to 4M up to 100G will remove enough numbers to at least have some exponents available when we need them.

Maybe someone can build a simple website to coordinate this?

I'm currently sieving K=4847 (2-3M) in a few ranges between 200 and 300G. Although my sample is way too small to draw conclusions about how far to sieve, currently I'm removing candidates between 10 and 15 times faster than I can test them with the SoB client.

smh
12-10-2002, 02:26 PM
Just to give a little status update.

I just merged the sieving files from different computers.

Above N = 2,000,000 there were 4220 N's left after sieving to 208G (including the ones that already had a prp test).

I sieved from 208G to 280G and removed 37 N's.

This is mainly done by a PIII 450 with a little help from a PII400 (last weekend) and a few hours on my home P4.

If I make a very rough estimate, the run took about 12 P3 days, so I removed a bit over 3 numbers a day, which is IIRC more than 10 times faster than prp testing numbers of this size.

(I had one day which removed only one number, and twice removed two N's within a few minutes.)

I'll let it run for a while; the P3 makes slow but steady progress of about 6G a day. Help is always welcome of course ;)

shifted
12-10-2002, 03:22 PM
Can I compile that sieving program to run on Solaris? Or does it use x86 instruction codes...

nuutti
12-10-2002, 03:59 PM
I guess that smh is using this program:
http://www.utm.edu/research/primes/programs/NewPGen/
But it should not be a big task to write a general one (ANSI C).
I am not able to do that, but the problem is a lot easier than writing the
Seventeen or Bust client.

Yours,

Nuutti

philmoore
12-10-2002, 04:57 PM
I've just started to think about this, but it seems to me that it might be more efficient to sieve all 14 k values at the same time. Sieving on n is equivalent to determining, for each prime p, whether k*2^n+1==0 mod p, or what is the same thing, 2^n==-1/k mod p. So checking divisibility by p can be done by computing 2^n mod p and seeing if this is equal to any of the 14 values of -1/k mod p. I really don't know a lot about sieving, but my guess is that Paul Jobling and/or Jim Fougeron could tell us if this is feasible.

What got me thinking about it is that smh reported eliminating exponents 10 times faster than he could prp them, and if I understood smh correctly, that was on just one k value. That says to me that more resources on sieving are needed, and this will become especially important as the project tests larger and larger n values.

Sieving is what gives this project an advantage over GIMPS when it comes to looking for large primes. No two Mersenne numbers with prime exponents can have a common factor, so trial factoring is a more effective method of removing composite Mersenne candidates. However, GIMPS still ends up running primality tests on 40% of their original candidates; my guess is that SoB ends up actually testing a much smaller percentage. (Does anyone have any statistics on this?)

Phil
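The divisibility check Phil describes can be sketched in a few lines of Python. This is a hypothetical helper of my own, not project code; the prime and k value in the test are small illustrative choices (note pow(x, -1, p) for modular inverse needs Python 3.8+):

```python
def find_divisible_n(p, ks, n_lo, n_hi):
    """For prime p, return (k, n) pairs with k*2^n + 1 == 0 (mod p),
    i.e. 2^n == -1/k (mod p), for n in [n_lo, n_hi)."""
    # One target value -1/k mod p per k value (skip k divisible by p)
    targets = {(-pow(k, -1, p)) % p: k for k in ks if k % p}
    hits = []
    r = pow(2, n_lo, p)  # 2^n mod p, updated by doubling each step
    for n in range(n_lo, n_hi):
        if r in targets:
            hits.append((targets[r], n))
        r = (r * 2) % p
    return hits
```

The point of the simultaneous sieve is visible here: the per-n work (one doubling and one set lookup) is the same whether `ks` holds one k value or all fourteen.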

nuutti
12-10-2002, 05:10 PM
You can mail one of the following mailing lists and ask about that:

http://groups.yahoo.com/group/openpfgw/
http://groups.yahoo.com/group/primeform/
http://groups.yahoo.com/group/primenumbers/

or search these mailing list archives using the keyword sieving/sieve.

Phil Carmody has sieving pages here:
http://fatphil.org/maths/
http://fatphil.org/maths/sierpinski/
and even source code for a factorial sieve:
http://fatphil.org/maths/factorial/factsv.c

But I suggest contacting Phil by email if you are going to use his data/code.

Yours,

Nuutti

smh
12-10-2002, 05:41 PM
Yes, I'm using NewPGen. As far as I know it's the only program that can handle the fixed-k sieve. If searching for a large Proth prime, it's more efficient to sieve on fixed-N numbers. In that case, TwinGen is also available, which has SSE2 instructions, so it should be much faster on the P4. As far as I know, the next version of NewPGen will have SSE2 instructions too.

How many numbers are left after sieving greatly depends on the weight of the number. All the remaining candidates have a rather low weight (a lot of N's are removed by small primes).

For K=4847, which I'm working on (on a machine which can't connect to the SoB server), there are 500000 odd values between 2 and 3M. Sieving to 208G removed 495780 of them (this is where I continued Payam's file).

When I merged the files (because it was getting messy) the sieving was at 280G, which is about 35% higher, and this eliminated 37 numbers (about 0.9%). On the primenumbers list I read that sieving twice as far eliminates about 3% more numbers, which is about in line with my small sample.

So when sieving to 1T there will be almost 4000 numbers left to test, which is about 0.8% of all numbers.

The higher N gets, the longer a prp test takes, so it makes sense to sieve further, and the percentage should only go down as the numbers get larger.
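As a rough cross-check of these figures, the standard Mertens-style heuristic says that sieving from depth p1 to p2 leaves about a log(p1)/log(p2) fraction of the survivors. A back-of-envelope sketch (my own helper, nothing from NewPGen):

```python
from math import log

def expected_survivors(count, p_done, p_target):
    """Mertens-style heuristic: of `count` candidates already sieved
    to depth p_done, about count * log(p_done)/log(p_target) are
    expected to survive further sieving to p_target."""
    return count * log(p_done) / log(p_target)

# 4220 survivors at 208G, sieved on to 280G: predicts roughly 48
# removals, the same order as the 37 actually observed above.
removed = 4220 - expected_survivors(4220, 208e9, 280e9)
```

The same formula predicts that doubling the depth around 200G removes about 2.6% of the survivors, consistent with the "about 3%" figure quoted from the primenumbers list.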

philmoore
12-10-2002, 07:01 PM
Thanks, smh, for the interesting data. I was curious how it compares with GIMPS's rate of eliminating primality-test candidates. In that same range from n=2,000,000 to 3,000,000, there are 67,883 prime values of n (composite values automatically make 2^n-1 composite), and of that number, I'm guessing from the GIMPS status page that trial factoring eliminated about 57% and the remaining 43% had to be tested, about 29,000 tests. So 93.2% of the exponents are eliminated for free, because they are composite, and of the remaining 6.8%, only about 2.9% of the original number are left after trial factoring.

For your case of k=4847, the sieving done so far gets you down to about 4200 numbers left to test, or 0.42% of the original number. (You mentioned getting it down to 0.8%, but this really should be considered 0.4%, since the reason you only sieve on odd values of n is that 4847*2^n+1 is divisible by 3 when n is even. For k=5359, we sieve instead on even values of n, since 5359*2^n+1 is divisible by 3 when n is odd. By only working with even/odd values, we can reduce the amount of storage needed for the sieve by half.)

So that is good: 0.4% is a lot less than 2.9%, except that we are trying to simultaneously search all 14 k-values! When I take into account the Proth values for all 14 k-values, I estimate closer to 52,000 numbers to test for primality in the range n=2,000,000 to 3,000,000, which translates into more tests than GIMPS, not fewer. Of course, further sieving will continue to reduce this number, but eliminating a couple more k-values would reduce it even more!

smh
12-11-2002, 03:35 AM
Yes, you are right about the even/odd values; I got mixed up with a fixed-N sieve.

Looks like you've got what you asked for. Another prime :)

Another way to speed up the sieving is searching a larger range of N's. The cost of sieving grows as the square root of the size of the N range, so it's faster to do one search for 2-3M than ten searches of 100K each.

I'm still wondering how far the other numbers are sieved. According to Phil's page, a lot more sieving needed to be done.

Sander
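The square-root cost Sander mentions comes from the baby-step giant-step discrete-log technique used for large p. A minimal sketch of that algorithm (my own illustration, assuming this is how the sieve resolves 2^n == target mod p; not NewPGen's actual code):

```python
from math import isqrt

def bsgs_pow2(target, p, n_max):
    """Smallest n in [0, n_max) with 2^n == target (mod p), or None.
    Baby-step giant-step does O(sqrt(n_max)) work, which is why
    doubling the n range costs only ~41% more (sqrt(2) ~ 1.41)."""
    m = isqrt(n_max) + 1
    # Baby steps: table of 2^j mod p for j < m
    baby = {}
    x = 1
    for j in range(m):
        baby.setdefault(x, j)
        x = (x * 2) % p
    # Giant steps: multiply target by 2^-m repeatedly, scanning
    # blocks [i*m, (i+1)*m) for a match (pow(2, -m, p) needs 3.8+)
    step = pow(2, -m, p)
    g = target % p
    for i in range(m + 1):
        if g in baby:
            n = i * m + baby[g]
            return n if n < n_max else None
        g = (g * step) % p
    return None
```

Combined with the -1/k targets, this is the shape of the fixed-k sieve: one BSGS run per prime p answers "does p knock out any n in the whole range" at square-root cost in the range length.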

Bellbox
12-12-2002, 01:45 AM
I'd be happy to help with the sieving; I've got a P4 2.2 that I could throw at it. I have NewPGen, just tell me what you want me to do.

philmoore
12-12-2002, 12:55 PM
I just received an email from Paul Jobling that he asked me to forward to this forum: (My questions to him are marked by >.)

> My question is, would there be any advantage to
> sieving all 13 of these candidates simultaneously?

I have just had a look at the code, and there would be a saving over performing the sieving all on one machine. There would even be an advantage over sieving on 13 machines, since with 13 machines you could give each of them a range of p to test and then combine the results at the end (though that does mean you are committed to leaving each of them to run until it completes its range).

I can definitely see an advantage in sieving all 13 candidates together - indeed, I can see that modifying NewPGen specifically for this task would not be too hard, assuming, of course, that it would be used.

> My understanding of NewPGen (from Phil Carmody's web-page) is
> that for large values of p, the program does a "giant steps,
> baby steps" solution of the discrete logarithm problem.

That is correct.

> How do you know whether the equation 2^n==-1/k mod p even has
> a solution in n? (Phil Carmody apparently refers to an
> observation of Payam Samidoost that allows one to identify
> such non-soluble congruences.)

That is right. Payam observed that for odd n, legendre(-2k/p) must be 1 for p
to divide k*2^n+1. For even n, they must satisfy legendre(-k/p)=1.

> Perhaps the project is in need of some sort of distributed sieving client that lets
> different sievers work on different p ranges simultaneously.

I would have no problem with modifying my code to perform this sort of sieving. The SoB guys are getting some good results, and anything that can be done to help them would be useful.

> Thanks for any help you can give!

No problem.

Regards,

Paul.

------

And then a second email:

Phil,

By the way, I just went along to the SoB forum and saw the thread on sieving. I wanted to post a reply, but I cannot yet - do you think that you could do that for me? It looks like (a) the sieving needs to be coordinated, and (b) a
dedicated SoBSieve program would improve performance. I could probably get it written in about 6 hours.

Regards,

Paul.
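Payam's criterion from Paul's first email can be checked numerically with Euler's criterion (a^((p-1)/2) mod p is 1 for quadratic residues, p-1 for non-residues). The helper names below are mine, and the k/p values in the test are small illustrative ones:

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's
    criterion: returns 1 (residue), -1 (non-residue), or 0."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def p_can_divide(k, p, parity):
    """Payam's test: can prime p divide k*2^n + 1 for some n of the
    given parity? Odd n needs legendre(-2k/p) == 1; even n needs
    legendre(-k/p) == 1."""
    a = -2 * k if parity == "odd" else -k
    return legendre(a, p) == 1
```

Primes failing both parity tests can never divide any candidate and are skipped entirely, which is where the speedup over blind trial division comes from.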

Vato
12-12-2002, 01:00 PM
What a nice, pleasant and helpful chap!

jjjjL
12-12-2002, 02:18 PM
philmoore/paul - a 13k sieve would definitely be cool. If Paul could do that, I would take advantage of it for sure.

We used to use a custom sieve Dave and I wrote, but it wasn't as efficient as numbers got higher, so I started using NewPGen. Just to give you an idea, this is the way sieving is done now:

The sieving is carried out on an (MPI) cluster of Linux computers run by the university. Specifically, they are dual P3s. I run 2 k values on each one.

I used the split function on k=33661 to sieve it across 20 systems for about a week and was able to quickly sieve it up to 1T. However, it's a lot of work to do all the splitting and script writing to submit jobs to the cluster that way, so I'm only running one process on each k now.

I don't know how feasible it is, but if it were possible, it would be nice to have NewPGen spawn multiple threads. The current split system, while good, is probably not as optimal as it could be, especially since re-merging and splitting the files out occasionally to remove n values from each file is a lot of work... nothing I can automate. In other words, I don't do it.

If I could sieve all k values at once and have NewPGen automatically spawn 20-40 threads at a time, that would be great. The cluster system uses MPI/PVM, but as long as the threads can be spawned with a little delay between each other, I think regular threading will load balance itself correctly across the cluster. If Paul were interested in attempting this, I would be happy to work with him to test it out or help with the programming.

-Louie

nuutti
12-12-2002, 02:24 PM
Here are a few thoughts that came to my mind about this issue:

Quite a lot depends on Louie and Dave. Louie mentioned once that he has some kind of cluster for sieving and removing candidates.
I would like to hear more about the current sieving situation.

The best solution would be to have PC sieving software plus generic 64-bit sieving software that could be compiled on several 64-bit platforms. I guess that Paul's software is based on highly optimized PC assembler. On the other hand, 64-bit sieving should be faster than 32-bit sieving past 4.3G=2^32 (I guess).

The N limit in sieving should be quite high because we will reach n=3,000,000 quite soon. We will be "ten or bust" quite soon also.

If we are going to use Paul's software, then he should get credit when we find a prime.


Yours,

Nuutti

shifted
12-12-2002, 04:37 PM
Yes, 64-bits would be cool. I have access to a small lab of Solaris machines that could run it at night.

smh
12-12-2002, 05:20 PM
Originally posted by Bellbox
I'd be happy to help with the sieving; I've got a P4 2.2 that I could throw at it. I have NewPGen, just tell me what you want me to do.

You can take Payam's file (attached in the first post) as the input file and start sieving from 350 or 400*10^9 (just change the first line of the input file).

You can remove all numbers between 1M and 2M from the file, since it's already known that they are all composite. My save file (which has another 40+ numbers removed) is in the office, so I'm not able to retrieve it until early next week. But the 40+ numbers shouldn't really affect the speed of the sieving.

I'm currently at 293G, and during the weekend I can add another one or two machines, but progress is too slow on 400 MHz machines. I definitely can't make it far enough before we start testing 2M numbers.

Please post here which starting point you'll take so we won't duplicate anything. Later on we can merge the files.

Sander

philmoore
12-12-2002, 06:01 PM
SoB is already using Paul's software, but Paul is just offering to tailor a special version to the project's needs.

Perhaps Paul can tell us if using 64-bit processors will be a significant advantage to the sieving effort. One advantage of using PCs is that there are plenty of them around, and a comparison of the amount of work to be done with the amount of computing resources available might mean that 32-bit machines make more sense in the end, even though they might not be as optimized to the task.

Sander (smh) says that he was removing candidates 10 to 15 times faster than they could be tested for probable primality. That suggests that the optimal sieving limit should be 10 to 15 times higher than it currently is, or in the range of 2T to 4T (2e12 to 4e12). (By optimal sieving limit, I mean the upper limit of p values being tested for being factors of any k*2^n+1.) Now he was sieving the range of n from 2M to 3M. Take into account that the time to do a prp test goes approximately as the square of n (please correct me, somebody, if I'm making a mistake), then in the range of n from 4M to 6M, we probably want a sieving limit for p of 8T to 16T, four times higher.

I think it is important at this point to decide what n range needs to be sieved next. Current sieving goes up to about n=3M. We don't know how fast this project will grow, but it could be pushing this limit within a few months. The next stage of sieving should go from 3M (or even 2M) up to some upper limit which will keep the project busy for a couple of years.

The cost of sieving is reported to be proportional to the square root of the size of the range in n. (This would follow from the baby-steps, giant-steps part of the sieving algorithm.) What this means is that doubling the size of the n range only increases the effort for a fixed p range by 41% (sqrt(2)=1.41). So it probably makes sense to increase it by as much as possible. n=10M would certainly keep the project busy awhile if it doesn't grow too fast. n>33M could actually put SoB into competition with GIMPS to discover the first 10 million digit prime.

However, and this is my main point, for any reasonable choice of new n range, the optimal amount of sieving work is large! For n=10M, I estimate over 20 of Sander's P3 years of work (and this is working on just one k-value), and this jumps to over 100 years for n=20M, since this range will eventually be sieved to higher p. If Louie thinks he can do this quantity of work on the cluster, great, but whether it is distributed on the cluster or distributed as part of the SoB general effort, he brings up a good point about the difficulty of combining work done on different p ranges on different machines. In the early stages of the sieve, many n-values are removed, but in later stages, according to Sander, comparatively few n-values are removed. Wouldn't it be more efficient in these later stages to just use lists of deleted n-values to update a master list? I thought that this sort of thing was what Louie was doing now until I read his last post, but surely there has to be a way of automating the rather tedious process he described.
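Phil's depth scaling can be written down directly. A toy calculation of my own, following his stated assumption that a prp test at size n costs roughly n^2:

```python
def scaled_sieve_limit(base_limit, base_n, new_n):
    """If a prp test at exponent n costs ~n^2, the break-even sieve
    depth scales by (new_n/base_n)^2 when moving to a new n range."""
    return base_limit * (new_n / base_n) ** 2

# 2T optimal around the middle of n=2M-3M becomes 8T around n=4M-6M
# (range midpoints 2.5M and 5M), i.e. "four times higher" as above.
deep = scaled_sieve_limit(2e12, 2.5e6, 5.0e6)
```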

Vato
12-12-2002, 08:32 PM
Isn't this where SoB becomes more like GIMPS, where the next client will support sieving (as opposed to trial factoring)
as an optional work unit type? It sounds like the right way forward - especially if there are other problems to attack afterwards...

Bellbox
12-12-2002, 11:41 PM
Originally posted by smh
You can take Payam's file (attached in the first post) as the input file and start sieving from 350 or 400*10^9 (just change the first line of the input file).

You can remove all numbers between 1M and 2M from the file, since it's already known that they are all composite. My save file (which has another 40+ numbers removed) is in the office, so I'm not able to retrieve it until early next week. But the 40+ numbers shouldn't really affect the speed of the sieving.

I'm currently at 293G, and during the weekend I can add another one or two machines, but progress is too slow on 400 MHz machines. I definitely can't make it far enough before we start testing 2M numbers.

Please post here which starting point you'll take so we won't duplicate anything. Later on we can merge the files.

Sander

I started sieving at 400*10^9.

nuutti
12-13-2002, 01:50 AM
Louie and I sent our messages at the same time, and his even a few minutes before mine, so he actually answered my questions before I posted them to the forum.

Yours,

Nuutti

smh
12-13-2002, 10:38 AM
Originally posted by Bellbox
I started sieving at 400*10^9.

Great, your P4 will go much faster than my P3. I'll do the sieving up to 400G, and I'll decide later what I'll do afterwards.


Isn't this where the SOB becomes more like GIMPS where the next client will support sieving (as opposed to trial factoring)

Trial factoring is very inefficient on Proth numbers. Sieving is most efficient on a larger range of N's. Like Phil said before, sieving a twice-as-large range of N's to the same depth takes only 41% more time. To be efficient, a client should receive a rather large input file (although this can be compressed a lot), sieve for one or a couple of G, and return the deleted N's. I'm not sure how much bandwidth this would take. The server should delete the composite N's from the database and generate a new input file.

Not impossible, but it's probably easier to see this as a separate project, at least for the time being. The client and the server would need new code, and I don't even know if the NewPGen code is publicly available.

I haven't calculated exactly, but Phil's numbers should be in the ballpark. A P3@450 needs 5 to 6 months to sieve a range of 1M n's to 1T. A larger range would be more efficient, of course. I guess an average P4 is 6 times faster, and with the use of SSE2 maybe 10 times. But with 10 K's available at 3M there is still a lot of sieving that needs to be done.

I don't know what Louie's plans are, and we haven't reached 3M yet (it will take a while, I guess), but we should definitely think about it. Maybe someone can set up a little web site to coordinate things (download/reserve etc). I might be able to do this if no one else wants to, but it also depends on how Louie feels about this.

ceselb
12-13-2002, 12:17 PM
I'd like to help if I can. What would be best: starting 4847 at 500*10^9, or doing another k?

I have a PIV 1.5Ghz if it makes any difference.

ltd
12-13-2002, 12:37 PM
I would also like to help.

But to be honest, I am not sure how the first line in the *.fil file should be changed.

And is the *.del file what I have to set as the output file?

And last point: how do I tell the system to stop before reaching the range of the next siever?

I can offer an AMD TB 800 full time or an AMD XP 2000+ part time.

Lars

smh
12-13-2002, 01:58 PM
I'm sieving up to 400G, and Bellbox is starting at 400G.

I don't know how far he plans to go. He has a P4, so I guess 100G would take a couple of days.

Of the first line in the *.fil file you only need to change the first number. This is the starting point, and it should be about 208xxxxxxxxx in the zip file. If you start at 500G you put 500000000000 in there.

Now you can use the *.fil file as the input file, and choose another filename as output. Something like *.out

It's important that you go to Options and then set 'Log the numbers removed'.

This will put the removed numbers in the *.del file.

Finally, you set Options => Sieve until => Sieve up to p=
however far you want to sieve.

If you stop the search and want to continue later, you can rename the output file to *.in to use it as input, and *.out for a new output file. Remember to check 'Log the numbers removed' again.

This forum is not really an easy way to coordinate, but it's the best we've got so far. I don't know about the other K's; I guess Louie is still sieving them to further depths.

jjjjL
12-13-2002, 02:24 PM
It's cool to see people interested in getting involved here.

However, I don't want people to get ahead of themselves and duplicate my sieving work. Sieving on all the k values (except 4847) is already being done by me. I have an account on my university's supercomputing clusters. I can queue all kinds of work... it took me a week to sieve 33661 up to 1T. Right now I'm in contact with Paul Jobling, and we should be making a threaded, multiple-k siever specifically tailored to our project's and my cluster's needs. There is no need for a community of dedicated sievers to pop up yet.

I understand that some of you have computers that can't run SB for one reason or another and you just want to help. I'm not going to stop you if you want to do random factoring. Anyone who sends me sieve results for any k will help, because I will remove the tests (after verifying, of course)... I'm just guessing that a lot of work will get duplicated, since I have systems removing factors all the time. As it is, I think almost all the k values are sieved to about the right limit for n < 3000000. As it is, I'm removing fewer than 90 tests a day with the 24 systems running the sieving. I could run more, but diminishing returns are setting in, so I don't want to waste too much time.

Right now I'm thinking about sieving the next set of values on a faster AMD-based cluster with a larger number of processors. I'll probably sieve from n = 3M to 30M or 40M all at once, due to the fact that testing more n values at once takes very little added effort. I may even sieve higher... I'll probably get Paul's input on what he thinks would be a good balance before I start that.

So for right now, I'll let Sander coordinate the effort to sieve k=4847 from n=2-3 million. If you can collect all your results together and send them to me a little before those tests start being assigned (a few weeks from now, I'm guessing), that would be great.

-Louie

ltd
12-13-2002, 02:36 PM
Thanks for the answers smh.

Because ceselb wants to sieve from 500, I will start from 600 up to 700.

Lars

smh
12-13-2002, 03:11 PM
Thanks for your answer Louie.

As you are removing 90 numbers a day on 24 computers, that's still almost 4 a day per computer. I guess even the fastest P4 available can't do one test in 6 hours with N around 3M, so it might be worth continuing a bit further.

Sieving all the remaining K's at once would be really cool, especially if it would speed things up a little too. But 30 or 40M seems ambitious. It will take years to get there (GIMPS is not even at 20M, and an LL test is much faster than a Proth prp test!)


Okay, I don't mind coordinating the sieving for K=4847 for a while.

I think 100G (100x10^9) chunks are nice for one searcher, so just post which chunk you'll take, and once you're done, send the output file and the *.del file to sander3005 @ yahoo.com
I'll merge the files and send them to Louie once we get close to 2M. (Delete the 1M to 2M part from the input file; there's no need to sieve those further.) When you send me the files, please clearly state which range you sieved!

200 - 300 SMH (Done)
300 - 400 SMH
400 - 500 Bellbox
500 - 600 ceselb ???
600 - 700 LTD

ceselb
12-13-2002, 03:24 PM
Cool. It'll take me ~2 weeks to sieve this chunk. Is there any significant gain if you remove eliminated N's from the other people's ranges as you go?

philmoore
12-13-2002, 05:35 PM
How much faster is a Lucas-Lehmer test than doing a probable primality test on a Proth number of similar size? I would guess not more than twice as fast, maybe someone else can answer that.

True, 30 to 40 million seems like a long way off. But 10 million seems like a good minimum, and increasing the sieve size to 40 million only doubles the expected amount of sieving time to the same depth. For n around 30 million, the optimum depth is deeper, but as long as the project reaches an optimum depth for n around 3 million by the time those numbers come up, sieving can continue for a long, long time to get to the optimum limit for 30 million. Remember, it's Seventeen OR BUST! Who can predict what this project will look like in 3 years? I don't think it is at all inconceivable that the exponents being tested will be in the 20 millions by then. GIMPS wouldn't be at the point where they are now if it hadn't been for continued hardware improvements, and we can only hope that more improvements will benefit this project.

One more point: Even though prp tests take longer, as more k values are eliminated, SoB will eventually end up with fewer candidates to test in a given range of n than GIMPS.

Mystwalker
12-13-2002, 05:49 PM
One :|ot|: question:

As all tested k's have a Proth weight > 0, there's definitely a prime (infinitely many, to be more specific), right? So it's not a question of if we find one, but when, is it?
Just for general understanding...

TIA!

philmoore
12-13-2002, 06:12 PM
Good question! If we definitely knew there was a prime for each k, we wouldn't even have to test! From our knowledge of distribution of primes, we can say that the probability that any of these infinite sets of numbers k*2^n+1 contains no prime value is zero, but that is not a rigorous estimate. On the other hand, the results of Payam Samidoost and others in the "dual Sierpinski project" makes it extremely unlikely that 13 of the 14 remaining k values can be a Sierpinski number. The one value which has not been eliminated as a dual Sierpinski candidate is k=28433, so finding a prime value for 28433*2^n+1 would be sweet! Joseph Arthur is currently testing near n=1,800,000 on this k value.

Samidoost
12-14-2002, 12:05 AM
Here are the links to the dual Proth searches:

http://sierpinski.insider.com/28433
http://sierpinski.insider.com/dual

Just for those who have not been involved or informed yet: dual Proths are numbers of the form k+2^n, which can be considered as the negative continuation of Proth numbers k*2^n+1:

k*2^(-n)+1 = k/2^n+1 = (k+2^n)/2^n

For more details see
http://groups.yahoo.com/group/primenumbers/message/6982
and the messages which followed it.

Their most successful application was to narrow the search for the Sierpinski and Riesel problems. Currently there remains 1 candidate for Sierpinski and 11 for Riesel.

For the dual Riesel search you can see http://sierpinski.insider.com/riesel

Payam :)
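Payam's dual form has the nice consequence that a prime p divides k + 2^n exactly when 2^n == -k (mod p), which is what lets the dual search reuse sieve-style machinery. A quick numeric check (my own sketch; the small n and limit are illustrative, not actual search parameters):

```python
def dual_divisors(k, n, limit):
    """Odd primes p < limit dividing k + 2^n, found two ways: by
    direct trial division, and by the congruence 2^n == -k (mod p).
    The two lists should always agree."""
    value = k + 2**n
    by_division = [p for p in range(3, limit, 2)
                   if all(p % q for q in range(3, int(p**0.5) + 1, 2))
                   and value % p == 0]
    by_congruence = [p for p in by_division if pow(2, n, p) == (-k) % p]
    return by_division, by_congruence
```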

Bellbox
12-14-2002, 12:20 AM
Well, it's running slow and steady as I expected; had a little glitch yesterday so I didn't really start till about noon today. Hopefully we can get this done in the next few weeks and start looking for those (not-so) elusive primes in this range.

Current Status:
405 billion
2 numbers eliminated.

shifted
12-14-2002, 02:35 AM
Originally posted by Samidoost
Their most successful application was to narrow the search for the Sierpinski and Riesel problems. Currently there remains 1 candidate for Sierpinski and 11 for Riesel.

For the dual Riesel search you can see http://sierpinski.insider.com/riesel

That URL is incorrect :(

Samidoost
12-14-2002, 02:45 AM
> That URL is incorrect :(

Now it is corrected.

Payam ;)

smh
12-14-2002, 06:59 AM
Originally posted by ceselb
Cool. It'll take ~2 weeks to sieve this chunk for me. Is there any significant gain if you remove eliminated N's for the other peoples ranges as you go?

Removing the numbers between 1M and 2M will save 40% of the time (if you haven't already done that).

I realized I have a save file at home which has all the candidates left when sieved to almost 300G. I've removed a couple more numbers since, but that doesn't really speed things up.

Samidoost
12-14-2002, 07:19 AM
Sander

May I request the list of removed numbers? Your attached file does not contain them.

Payam

smh
12-14-2002, 09:49 AM
I didn't want to make things too messy, so I just copy/pasted the removed numbers and sent them to your e-mail account.

Samidoost
12-14-2002, 10:52 AM
Thanks, Sander

I will include them in http://sierpinski.insider.com/4847

Payam

ltd
12-15-2002, 05:33 AM
One more question,

is there any possibility to tell how many values you expect to find while sieving a given range?

I ask because I have now sieved from 600G to 612G and have no hits yet,
which does not look very effective at first glance.

Maybe I'll find something in the next few days.

Lars

smh
12-15-2002, 05:50 AM
Dunno, but all i know is that there can be a big difference in how many numbers you remove in a given range.

I had a couple of times that no nimber was removed in over 6G, but the closest numbers removed had a prime factor less then 4M away.

This is the breakdown of numbers removed per 10G range (starting with 210-220G):

2-6-4-8-6-4-5-3-9

As you can see, a big difference.

And the higher you sieve, the fewer candidates you remove.
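
(Editorial aside, not from the thread: the falling yield smh describes has a standard heuristic explanation. By Mertens' theorem, the expected number of candidates a sieve removes between p1 and p2 is roughly proportional to log(log p2) - log(log p1), so equal-width blocks yield less and less as p grows. A rough sketch:)

```python
from math import log

def relative_yield(p1, p2):
    # Mertens-style heuristic: expected removals in [p1, p2] are roughly
    # proportional to log(log(p2)) - log(log(p1))
    return log(log(p2)) - log(log(p1))

early = relative_yield(210e9, 220e9)    # a 10G block near the start
late = relative_yield(1400e9, 1410e9)   # a 10G block much deeper
print(round(early / late, 1))           # → 7.0 (the early block yields ~7x more)
```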

ceselb
12-15-2002, 06:37 AM
I wouldn't worry too much. I'm at 515G and haven't found any either. They'll come eventually.
As I understand it, these values will take much longer to process than the current ones, so removals a day or two apart are still worthwhile.

Bellbox
12-15-2002, 10:48 PM
I'm currently at 426G, and have found 7 numbers so far.

ltd
12-16-2002, 10:52 AM
I should have started complaining a little bit earlier. :bang:

Only two hours after my mail i had the first four numbers removed. :rotfl:


Lars

ceselb
12-16-2002, 02:11 PM
Then it's only me without any removed... (I'm at 524.6G if anyone wonders)

Nice to see some progress elsewhere though :)

smh
12-18-2002, 02:30 PM
On the primenumbers mailing list Phil C. wrote:


My choice of 3M as my sieve range was a curious one. As the sieving algorithm has roughly O(n^(1/2)) behaviour, where n is the range of the exponents (500000..3000000 would be 2500000), I have several criteria to balance:

1) n not too large,
a) as it's slower
b) as several k would crack early, everything beyond that would be 'waste'
c) as sieving next year is quicker than sieving this year! (~Moore's law)

2) n not too small,
a) as then the overhead of sieving vs. testing would increase
b) as O(n^(1/2)) still favours large n, being better than linear

If fewer than half of them have split by 3M then I'd say I got the balance
right. If more than 8, I'd say we sieved too large a zone.

For the future, the ones that reach n=2M probably should have sieving in a
band of about 3M-20M started, with the same reasons as above.

Just gut feel.


Louie, Phil M,

Any news about a customized newpgen which can do several K's at the same time?
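
(Phil's O(n^(1/2)) claim can be turned into a quick estimate; this is my own back-of-the-envelope using his figures, not anything from his post. His current band 0.5M-3M covers 2.5M exponents; the proposed 3M-20M band covers 17M.)

```python
from math import sqrt

current_band = 3_000_000 - 500_000     # 2.5M exponents (0.5M-3M)
future_band = 20_000_000 - 3_000_000   # 17M exponents (3M-20M)

# sieving time grows ~ sqrt(exponent range); work generated grows linearly,
# which is why a much wider band is still a good deal
cost_ratio = sqrt(future_band / current_band)
work_ratio = future_band / current_band
print(round(cost_ratio, 2), round(work_ratio, 1))  # → 2.61 6.8
```

So the wider band costs about 2.6x the sieving time but feeds the project 6.8x the exponents, under this model.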

jjjjL
12-18-2002, 03:18 PM
Paul Jobling sent me the code required to do the multiple k sieve this morning. I am busy rewriting the SB client right now but I should find time soon to also write the threading code I want for the siever.

I'll let you guys know how it performs when it's done. Based on Paul's email, it looks as though there may be quite a significant performance gain. We shall see. :)

-Louie

frmky
12-20-2002, 03:50 AM
Hi,

Last week, I started to work on 900G. Unfortunately I wasn't able to tell you about it since apparently my ISP's spam filters really don't like this forum's activation emails. :mad: After numerous attempts, I finally gave up and changed the address to a junk email account I have but never check...but I got the email.

Anyway, since I was having problems getting it working, I started at 900G to reduce the chance that someone else would snatch up the range in the meantime. With the winter break coming up, though, I'm going to have to stop it soon, so it doesn't look like it'll get much beyond 950G. Good news is that it's removed about 10 numbers. I'll send them to ya either tomorrow or this weekend.

Greg

smh
12-20-2002, 07:47 AM
Great, just send me both the output file and the newpgen.del file.
Make sure you clearly state which range you sieved!

Today i received a mail from Paul Jobling with a new beta version of NewPGen 2.71.

I did a quick test, and for numbers with a low weight (like the ones we are sieving) it gives a speed increase of about 20% on my P3.

Since i don't have any e-mail addresses of the people who are sieving i can't send it. So if you want it, drop me an e-mail (sander3005@yahoo.com)

ltd
12-20-2002, 08:20 AM
Hi,

i will also do 700G to 800G if that range isn't taken by anyone else.


Lars

P.S. The range from 600G to 700G will be finished today with over 20 numbers removed. :|party|:

ltd
12-20-2002, 10:22 AM
Range from 600G to 700G is finished. Results are on the way to Sander.

In the end, 27 numbers were removed.

Lars

smh
12-20-2002, 10:38 AM
Thanks

27 numbers in a range of 100G is on average one number per 3.7G.

At a wild guess, my P3 removes 1 candidate about every 15 hours. I can't remember how long that pc needs for a test with N between 2 and 3M, but i think it's at least 3 days, probably getting closer to 5 as N gets high into the range. So sieving is still roughly 5 to 8 times faster at this depth.

All wild guesses, and probably different for faster machines, but the point is that it's still worth sieving deeper.
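
(The break-even arithmetic above can be reproduced in a couple of lines; these are smh's own figures, and the speedup depends on which end of the 3-5 day range a test falls.)

```python
hours_per_factor = 15     # one candidate removed per ~15 h of sieving (P3/450)
test_hours_low = 3 * 24   # one PRP test: at least 3 days...
test_hours_high = 5 * 24  # ...maybe closer to 5 for large N

# how many times faster sieving eliminates a candidate than testing does
print(test_hours_low / hours_per_factor,   # → 4.8
      test_hours_high / hours_per_factor)  # → 8.0
```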

Bellbox
12-20-2002, 06:47 PM
Well, I'm at 480G, removed 22 numbers so far. Should finish sometime in the next 2 days (maybe 3 max).

ltd
12-21-2002, 03:24 AM
The performance gain when using NewPgen 2.71 Beta
is much higher for me.

With version 2.70 i could crunch 0.286G per hour
With version 2.71beta this is 0.48G per hour.

Both on a PIII 700MHz.

Brilliant job done by Paul Jobling.

Lars

Bewolf
12-21-2002, 07:40 AM
Has anyone taken the 800-900G range?

If not, I will start in on that range.

eatmadustch
12-21-2002, 05:54 PM
just a question out of curiosity ... why does everyone want to sieve the lowest Sierpinski candidate, 4847, when there are 12 others?

smh
12-21-2002, 05:59 PM
As far as i know 800-900 isn't taken by anyone.

200 - 300 SMH (Done)
300 - 400 SMH
400 - 500 Bellbox
500 - 600 ceselb
600 - 700 LTD (Done)
700 - 800 LTD
800 - 900 Bewolf
900 - 9?? frmky

smh
12-21-2002, 06:07 PM
Originally posted by eatmadustch
just a question out of curiosity ... why does everyone want to sieve the lowest Sierpinski candidate, 4847, when there are 12 others?

27653 is already tested to 3M (only 5 numbers pending).
33661 was sieved to 1T earlier on (it was the second K the project started testing on).

And the other 10 are currently being sieved by Louie.

ltd
12-21-2002, 06:07 PM
why does everyone want to sieve the lowest Sierpinski candidate 4847, when there are 12 others?

That's because all the other K are sieved by Louie.

So only 4847 is our playground.

Lars

Edit:

Hi Sander, you were a little bit faster with your reply. :)

frmky
12-22-2002, 09:36 PM
The 900G-1T range is finished. 22 numbers were removed, and the files have been emailed.

Greg

Bewolf
12-22-2002, 10:34 PM
Paul Jobling did a great job speeding up NewPGen.

v2.70 = 0.437G/hour
v2.71b = 1.12G/hour
On a 1.4GHz Tualatin Celeron @ 1.6GHz

That's an incredible 156% speed increase :)

Bellbox
12-22-2002, 11:50 PM
I'm all done with 400-500 so I'll be emailing the output and deleted numbers shortly. Removed about 31 numbers.

Samidoost
12-23-2002, 02:49 AM
Hi

I have started a new thread about the k=28433 sieve. New results from Joseph Arthur go up to 318G.

Payam

ltd
12-24-2002, 08:55 AM
Range from 700G to 800G is also ready.
Removed 19 numbers.

Lars

smh
12-24-2002, 01:34 PM
Just now i merged all the files that were sent to me. So far, sieving has been done from 208G-500G, 600G-800G, and 900G-1000G (so only 500-600 and 800-900 are missing).

We started with 4221 candidates left above 2M (including a few Payam had already proved composite). The *.del file now contains 202 numbers, of which 3 are duplicates, so 199 candidates have been removed and we are left with 4022 candidates.

A breakdown of how many numbers were eliminated:

208-300 49
300-400 54
400-500 32
500-600
600-700 27
700-800 19
800-900
900-1000 22


So we can expect about 50 more numbers from the two missing ranges.
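
(The "about 50 more" guess can be reproduced by naively interpolating each missing 100G block from its neighbours; this is a hypothetical reconstruction of the estimate, not necessarily smh's actual method.)

```python
# removals per 100G block, from the breakdown above
removed = {"400-500": 32, "600-700": 27, "700-800": 19, "900-1000": 22}

# estimate each missing block as the mean of its two neighbouring blocks
est_500_600 = (removed["400-500"] + removed["600-700"]) / 2  # 29.5
est_800_900 = (removed["700-800"] + removed["900-1000"]) / 2 # 20.5
print(est_500_600 + est_800_900)  # → 50.0
```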

Bewolf
12-24-2002, 02:06 PM
I'm about half done with the 800-900 range with 10 numbers removed so far. I should be done with the range by the end of the week.

Are we going to stop at 1T? On average, sieving on my computer removes one number every 4.25 hours. That is a lot faster than the SB client can test a number. Maybe it would be worthwhile to sieve a little farther.

smh
12-24-2002, 04:12 PM
It's definitely worth sieving further. Sieving is still a lot faster than PRP testing, and even if sieving eliminates candidates at the same speed, you still won't lose anything.

My P3(450) removed about 3 numbers a day around 350G, while it takes at least 3 days (but could be 5; i don't have the logs anymore) to do one Proth test. That's still 10 to 15 times faster!!

My guess is that as long as we remove 4 or 5 numbers per 100G it's still faster than PRP-ing. At least for my slow P3(450). I don't know how this works out for a P4 or Athlon.

If anyone wants to take a range, please post it here. You don't necessarily have to take 100G, as long as you clearly state which range you'll do/have done.

I'll attach the merged file which has all candidates left after sieving to 1000G (except 500-600 and 800-900)

One more thing: if i send you the beta of NewPGen v2.71, can you please check the date the exe was created? The program itself says 19-12-2002, but the executable should be dated 20-12-2002.

The 19-12-02 version contained a bug that missed some factors. I just want to make sure i sent everybody the correct version.

ltd
12-24-2002, 06:03 PM
my executable has a timestamp of 20.12.2002 10:56; is that the correct version?

And by the way i will take 1000G to 1100G if nobody else has started this range.

Lars

smh
12-24-2002, 06:31 PM
Originally posted by ltd
my executable has a timestamp of 20.12.2002 10:56; is that the correct version?

And by the way i will take 1000G to 1100G if nobody else has started this range.

Lars

That's the right one. I'm pretty sure i sent the right one to everybody, but i got a little confused by version numbers, different PCs, and the two different e-mail accounts i used to receive the beta. Just wanted to make sure.

Okay, i'll continue at 1100G. It will take a while, but we're not yet testing numbers in the 2M range anyway.


200 - 300 SMH (Done)
300 - 400 SMH (Done)
400 - 500 Bellbox (Done)
500 - 600 ceselb
600 - 700 LTD (Done)
700 - 800 LTD (Done)
800 - 900 Bewolf
900 - 1000 frmky (Done)
1000 - 1100 LTD
1100 - 1200 SMH


=> EDIT:

I got worried just now when i saw that there were only 4021 numbers to test, as opposed to the 4022 i wrote earlier.
I was wrong: one of my counts included the first line of the output file (which is obviously not a number to test), and Payam's file had a blank line at the end (which notepad also counts as a line).

So we started with 4220, found 199 different N's composite (and 3 duplicates), so there are 4021 numbers to test.

MERRY CHRISTMAS ALL

ltd
12-25-2002, 06:01 AM
Info from Bommer:

He will do 1200G to 1300G.


Lars

Bellbox
12-25-2002, 09:50 AM
Could you send me the new version of NewPGen? I'd be happy to start sieving 1300G to 1400G.

smh
12-25-2002, 04:19 PM
I've just sent v2.71 to you.

Just to make things clear:


200 - 300 SMH (Done)
300 - 400 SMH (Done)
400 - 500 Bellbox (Done)
500 - 600 ceselb
600 - 700 LTD (Done)
700 - 800 LTD (Done)
800 - 900 Bewolf
900 - 1000 frmky (Done)
1000 - 1100 LTD
1100 - 1200 SMH
1200 - 1300 Bommer
1300 - 1400 Bellbox

Bewolf
12-26-2002, 01:33 PM
800-900G will be done today.

I will start on the 1400-1500G range.

ceselb
12-26-2002, 02:00 PM
I'm currently away. I'll be back some time after new years. My range will (hopefully) be done by then...

How did you guys manage to get them done so fast? Multiple computers?

ltd
12-26-2002, 04:22 PM
At the moment my AMD XP2000+ is doing the job.
At over 1040G the system crunches more than 1G per hour.
So 1T to 1.1T only needs about four days for me.

Lars

Bellbox
12-26-2002, 10:55 PM
Just a quick question about the new beta version of NewPGen (working fine by the way), SSE2 code has not been implemented yet, right?

Looks like it'll take about 5-7 days for me to complete the range.

smh
12-27-2002, 01:41 PM
Originally posted by Bellbox
Just a quick question about the new beta version of NewPGen (working fine by the way), SSE2 code has not been implemented yet, right?


No idea actually, but i don't think so. The what's new section doesn't say anything about SSE2.

We're below 4000 (3998) remaining numbers now. Below 1T, only the range 500-600 is missing. We've got to wait until the new year for that.

I can't edit my posts, so here is the list of ranges in progress again:

500 - 600 ceselb
1000 - 1100 LTD
1100 - 1200 SMH
1200 - 1300 Bommer
1300 - 1400 Bellbox
1400 - 1500 Bewolf

Angus
12-27-2002, 07:03 PM
What is it that you all are doing?

Is this some project within the SoB project, or something unrelated? There seems to be a whole lot of activity with this sieving, and I don't see anything about it on the SoB web site.

smh
12-28-2002, 11:55 AM
Originally posted by Angus
What is it that you all are doing?


In short:

We try to find factors for all numbers 4847*2^N+1 with N between 2 and 3 million. Most of these numbers can be divided by a small prime. If this is the case for a particular N, then we don't have to test it with the SoB client since it can't be prime. At the moment, of the 1 million N's in the range there are only about 4000 left to test with the SoB client.

When the tests that are assigned are near 2M, i'll send the results to Louie who can remove the numbers from the database.

Joe O
12-28-2002, 03:46 PM
Originally posted by SMH
When the tests that are assigned are near 2M, i'll send the results to Louie who can remove the numbers from the database.

At the current rate, 2006030 will be the N value in 4.5 days.

Bewolf
12-31-2002, 01:08 PM
1400 to 1500G is done with 11 numbers removed.

That averages out to one number every 8.2 hours. IIRC, the SoB client takes 12-15 hours for one test on my computer. There is still some advantage to sieving, but not much.

smh
12-31-2002, 02:22 PM
Originally posted by Bewolf
1400 to 1500G is done with 11 numbers removed.

That averages out to one number every 8.2 hours. IIRC, the SoB client takes 12-15 hours for one test on my computer. There is still some advantage to sieving, but not much.

I suppose the 12 to 15 hours is for exponents around 1.8 or 1.9M? Don't forget that numbers twice the size take about four times longer! So sieving is still 2 times faster around 2M, and it will be 4 or 5 times faster for exponents around 3M! And i guess you have a P4? If so, the advantage of sieving will be much greater due to the lack of SSE2 code in NewPGen (at least i think it still lacks SSE2 code; if not, it doesn't matter much which cpu you are running on).

If the client could be made 20% faster, that would be a big improvement, but sieving is at least 100% faster. I do call that a lot!
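
(The quadratic scaling smh invokes can be sketched as follows; this assumes test time grows roughly with n^2 and ignores log factors. The 13.5 h figure is just the midpoint of Bewolf's 12-15 h report at n around 1.8-1.9M.)

```python
def scaled_test_time(t_known_hours, n_known, n_new):
    # a Proth/PRP test's time grows roughly with n^2: the numbers get longer,
    # and the number of squarings needed grows with n as well
    return t_known_hours * (n_new / n_known) ** 2

# doubling n quadruples the time, as smh says
assert scaled_test_time(1.0, 1, 2) == 4.0

# extrapolating a ~13.5 h test at n ≈ 1.85M out to n = 3M
print(round(scaled_test_time(13.5, 1.85e6, 3.0e6), 1))  # → 35.5
```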

ceselb
01-01-2003, 03:18 PM
500-600G is done, 35 numbers removed.
Is anybody working on 1500-1600G yet, or have we sieved far enough?

smh
01-01-2003, 03:43 PM
As far as i know, nobody is sieving 1.5T-1.6T yet.

Sieving is still a lot faster, so it's definitely worth sieving further. OTOH, we're almost at N=2M, so i think i need to send the results so far pretty soon.

My 1100G to 1200G range is at 1156G, but i don't have spare cpu cycles to push this further, so you might want to continue from there too.

ceselb
01-01-2003, 05:49 PM
Ok, I'll start at 1156 using the beta version of NewPGen.

jjjjL
01-02-2003, 02:21 AM
Over the holidays it dawned on me how easy it would be to make a web form to allow people to submit factors. :)

Since a factor is easy to verify, I could allow anyone to submit them. I may allow people to submit with a username if they wish and then numbers factored could be part of people's stats again. As it is, there is already a spot in the database for factors so it's just a matter of writing the script. I'll let you all know when I (or Dave) get such a script working.

Maybe Paul Jobling will release the multiple-k sieve version of NewPGen if he knows that everyone can use it. As it is, I have been having some problems making his code thread-safe for the cluster siever I was planning. His code is great, but it was designed for speed... not threading.

As for the current sieving... Sander, if you have a full delete file or sieve file for 4847 it would be convenient if you could email it my way. Payam tells me that the new numbers are also at http://sierpinski.insider.com/4847

Anyway, I'll let everyone know when I make a factor submission/verification form. Maybe once that's done, there can be a more central spot where people can "reserve" ranges, download realtime updated sieve files, get links to paul's siever, and instructions on how to use it.

-Louie

philmoore
01-02-2003, 06:03 PM
Paul sent me this on December 18th and asked me to forward it to the list, but I was out of email contact over the holidays:

Hi Phil,

still can't post in the SoB forum, so could you put this information in there
for me?

I have produced a cut-down version of the fixed k sieve in NewPGen, and I have
enhanced it to sieve more than one k value at a time. I have also tinkered
with it a little, with the result that in the same amount of time, this can
sieve about 50% further than NewPGen 2.70.

I have sent the source code to Louis, and while he can't redistribute the
source, he can produce and distribute executables based on the source code.
The code can be built on Windows and Linux. Since it makes very extensive use
of x86 assembler, it will not be possible to produce a fast Solaris port.

Initially Louis was intending to use his 40 processor Linux cluster to do the
sieving, however there is nothing to prevent this being built into the client
at some point in the future, should the SoB coordinators decide that would be
useful.

I believe that the intention is to use this to sieve n values above 3 million
for the 13 remaining k values. It could be used with the remaining n values
below 3 million, but to be useful they should all be sieved to the same level
first.

Regards,

Paul.

jjjjL
01-02-2003, 07:01 PM
I came here to give good news. All the sieve results that Sander sent me have been manually merged into the database. This removed quite a few tests for 4847, so that's always good. I also added tests for k=28433 between 2 million and 3 million.

So today marks the first day that either of these k values has been assigned. :) This means that SB is now officially testing all remaining k's, not just 10 out of 12. Hopefully this will help us find more primes soon.

-Louie

Bewolf
01-03-2003, 12:23 AM
@Louie: A central place to reserve ranges for sieving would be really cool. I can't wait to see it happen.

@Sander: You're right. I looked back at my logs and the n's that took 12-15 hours were all around 1.5M. My latest block has an n of 1.9M and it looks like it will take 28 hours to complete. So sieving is still a whole lot faster.

smh
01-03-2003, 04:18 AM
Louie,

Are you purposely retesting the numbers Payam already tested or did you forget to remove them?

Sander

Cowering
01-03-2003, 01:21 PM
Is proving that the k+2^n dual problem only has 28433 left to check the exact same thing as what we are doing?

In other words, do we really have to FIND the prime for the non-28433 numbers even if there is proof that one already exists for some N?

I am good at math, but this is getting out of my league :(

jjjjL
01-03-2003, 02:39 PM
smh - yes
cowering - no

-Louie

Mystwalker
01-03-2003, 04:31 PM
A "yes" answer to an either/or question? Hm... :D

I'm quite interested in an illustration of that "negative continuation" stuff. Are the same k/n pairs primes (seems like it's not the case - so where IS the connection)? And as I'm afraid I'm not deep enough into prime theory, could someone tell me why it's easier to find those duals?

philmoore
01-03-2003, 05:19 PM
This seems a bit off-topic for this particular thread, so maybe if we want to discuss it further, we could start a new one. However, this topic was discussed on the primenumbers group several months ago, and a search there on "78557" turned up several interesting posts to that group. You might start with this particular one:
http://groups.yahoo.com/group/primenumbers/message/7042
which proves not only that 78557*2^n+1 can never be prime, but also that 2^n+78557 can never be prime. The relation of the two problems is discussed in other posts as well, in quite a bit of detail. The eventual conclusion is that elimination of a k-value from the dual Sierpinski project makes it extremely unlikely that it can be a Sierpinski number, but it does not prove it.

Samidoost
01-03-2003, 11:42 PM
There are 3 different problems:

(1) The original Sierpinski problem, which asks for the smallest k such that k*2^n+1 is composite for all n. SB has successfully removed 5 hard candidates out of 17 (really great work which must be continued).

(2) The dual Sierpinski problem, which asks the same question for the sequence k+2^n, which can be considered the negative continuation of the Proth sequence k*2^n+1:

k*2^(-n)+1=k/2^n+1=(k+2^n)/2^n

For this problem there remain 8 candidates to be removed:
2131, 8543, 28433, 37967, 40291, 41693, 60451, 75353
see http://sierpinski.insider.com/dual

(3) The mixed Sierpinski problem, which asks the same question for both sequences k*2^n+1 and k+2^n simultaneously.

For this special problem there remains only one candidate to be removed, namely k=28433.

Payam
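
(The identity above relating the Proth sequence to its dual can be confirmed with exact rational arithmetic; k=28433 and n=10 are just arbitrary sample values for the check.)

```python
from fractions import Fraction

# check k*2^(-n) + 1 == (k + 2^n) / 2^n exactly, with no rounding
k, n = 28433, 10
lhs = k * Fraction(1, 2) ** n + 1   # k*2^(-n) + 1
rhs = Fraction(k + 2 ** n, 2 ** n)  # (k + 2^n) / 2^n
print(lhs == rhs)  # → True
```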

Samidoost
01-04-2003, 12:10 AM
Originally posted by Mystwalker
could someone tell me why it's easier to find those duals?

It is not easier when we are considering the same n.

There are some minor points which must be considered:

(1) The part that makes it easier: the original candidates have been searched up to high n limits, while the duals have not been yet.

(2) The part that makes it harder: pfgw, which is used for finding duals, is a little slower than PRP or SB, which are used for finding Proths.

for more details you can visit http://sierpinski.insider.com/dual

Payam

ceselb
01-07-2003, 12:18 AM
1156-1200G done, 6 numbers removed.
Starting on 1500-1600G.

Bommer
01-07-2003, 03:46 AM
I sieved the range from 1200G-1300G and removed 21 numbers.

Cu Bommer

ceselb
01-08-2003, 03:24 AM
Should we organize an effort to sieve all k's for n > 3M, or is that being integrated into the SoB client?

MAD-ness
01-08-2003, 04:21 AM
Just got caught up on this thread.

Good work guys. :)

smh
01-08-2003, 09:05 AM
Originally posted by ceselb
Should we organize an effort to sieve all k's for n > 3M, or is that being integrated into the SoB client?

IIRC Louie is testing a program that can sieve all K's at once. Haven't heard much about how it is going.

I think the problem with making the sieving distributed is that it's most efficient to sieve a rather large range of N's, at least up to 10M but maybe 20M.

12 K's times 13M values, that's a lot. After sieving a couple of G, a wild guess would be that there are still over 1M values left to test. The NewPGen output format is plain text, so it will be several MB to start with. Even compressed, this will still be a lot of data to send to all clients.

ceselb
01-08-2003, 09:26 AM
Sure, it's a large download, but you'd only need to get it once. After that you'd update/remove values from the *.del files. If nobody is willing to automate this, I'm sure it could be done manually, either here or by mailing list (egroups (http://groups.yahoo.com)?)

philmoore
01-08-2003, 12:23 PM
I agree with ceselb, even though the list of numbers not yet eliminated by the sieve would be large, it would not have to be downloaded very often, and it seems that so far, the number of people interested in doing sieving is relatively small. I made a rough estimate of how many numbers would be left if all exponents from 3,000,000 to 20,000,000 were sieved to 400G for all 12 remaining k-values, and came up with about 700,000. Two bytes per number means a 1.4Gbyte file, and if instead of sending all n-values, only a difference in successive n-values was sent, this could be compressed by at least a factor of 2. Increase the range to 3,000,000 to 30,000,000 and we get around 1,100,000 remaining exponents. Of course, we can hope that eliminating a couple of k-values will reduce this size in the near future!
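
(The successive-difference idea philmoore mentions is plain delta encoding; here is a sketch with made-up exponent values — the real surviving n's are whatever the sieve leaves.)

```python
def delta_encode(ns):
    # store the first exponent, then only the (small) gaps between survivors;
    # small gaps take fewer bytes and compress far better than raw n values
    return [ns[0]] + [b - a for a, b in zip(ns, ns[1:])]

def delta_decode(deltas):
    out = [deltas[0]]
    for d in deltas[1:]:
        out.append(out[-1] + d)
    return out

ns = [3000007, 3000031, 3000079, 3000151]    # hypothetical surviving n values
assert delta_decode(delta_encode(ns)) == ns  # lossless round trip
print(delta_encode(ns))  # → [3000007, 24, 48, 72]
```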

ceselb
01-08-2003, 01:06 PM
Two bytes per number means a 1.4Gbyte file

No, it doesn't. I'm bored, so I sieved all 12 remaining k's with n from 3M-20M to p=10000. These files are 29.4MB in total, 5.37MB zipped. Not very large at all.

Bewolf
01-08-2003, 01:19 PM
As a test I sieved k=4847 for n=3M to 20M up to 4.4G, at which point there were roughly 86,000 numbers left. The NewPGen file was 1.2MB. Assuming this is average for all k values, a file for the remaining k values would be about 14MB after only 4G. This can be zipped down by 80%, so the resulting file would be less than 3MB, which isn't too bad to download.

I think the harder part would be organizing the sieve. A webpage for reserving ranges would be best (even better if it generated the input file for the range). Keeping track by hand would be tedious even with a handful of people.

philmoore
01-08-2003, 02:59 PM
Sorry, I meant MB when I said GBytes! But the input file would be the same for all sievers, since they would all be sieving the same range of n values.

ceselb
01-08-2003, 03:08 PM
I have a zip of all k's done to p=2M, here (http://www.visit.se/~ceselb/all_k.zip) (3.54MB) if anybody is interested.

Nuri
01-08-2003, 03:34 PM
It seems like Louie will take care of the organization, so it won't be a serious problem. He wrote;

"I'll let everyone know when I make a factor submission/verification form. Maybe once that's done, there can be a more central spot where people can "reserve" ranges, download realtime updated sieve files, get links to paul's siever, and instructions on how to use it."

I didn't participate in the 4847 sieve because I couldn't help much with 100G chunks (unfortunately I have only one PC, and its CPU time is already split among three projects, including SoB and GIMPS).

I'd love to participate when Louie launches the web form (especially if it enables reserving less than 100G for participants like me). I'm sure some other people will also join when it's launched and announced at the news page.

smh
01-08-2003, 05:00 PM
I'd love to participate when Louie launches the web form (especially if it enables reserving less than 100G for participants like me). I'm sure some other people will also join when it's launched and announced at the news page.

I didn't want ranges that were too small when i collected the results for K=4847 because i was doing everything by hand. 50G would have been okay too, but when you have to do it by hand it gets really hard to keep track of which ranges are done, which files are merged, etc.

I just took a look at ceselb's sieving results.
When sieving up to p=2M there are a total of 1278701 (+/- a few) numbers remaining, so my guess wasn't that bad :)

I tried the K with the most numbers left (55459), and when i started sieving i was removing almost 150 numbers per second (P4@2.4GHz), dropping to 65 numbers per second when sieved to 5M (with over 10000 composites removed). So i think we would be able to get this down to less than a million in not too much time. But i don't know how far Louie is with implementing things, and whether it makes sense to give the remaining K's a good head start.

Bellbox
01-08-2003, 10:21 PM
Yikes, sorry guys, I have been having problems with the PC I was using to sieve and I won't be able to continue with the sieving until I get it back up and running. I left it running over the winter break, and when I got back it was toast. It got up to 1340G, so I'll post the files here if anyone cares to continue the range up to 1400G.

Sorry once again, and I'll be back as soon as I can figure out what's up with the stupid sieving PC.

The removed numbers:

p=1316562430213 divides n=2443743
p=1318610142973 divides n=2531703
p=1318880710747 divides n=2415447
p=1339017372383 divides n=2717703
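
(Each reported factor is cheap to verify with modular exponentiation. This is only a sketch of the check; the actual submission backend Louie describes later in the thread is his own code.)

```python
def divides(p, k, n):
    # p divides k*2^n + 1 iff k*2^n + 1 ≡ 0 (mod p); pow(2, n, p) keeps the
    # intermediate numbers small even when n is in the millions
    return (k * pow(2, n, p) + 1) % p == 0

# tiny sanity check: 4847*2^1 + 1 = 9695 = 5 * 7 * 277
assert divides(5, 4847, 1) and divides(7, 4847, 1)
assert not divides(11, 4847, 1)

# Bellbox's first reported factor can be checked the same way:
print(divides(1316562430213, 4847, 2443743))
```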

jjjjL
01-09-2003, 12:27 AM
Yeah. Paul has finished a stand-alone Windows program that does the multiple-k sieve. He sent me a copy and I'm very excited about it. Not only is it faster than NewPGen 2.70... it is also capable of sieving all k-values at once. This is a huge improvement over single k-value factoring. This will make sieving to even higher ranges a viable, and intelligent, option. If enough people sieve, this could have a decent impact on overall system throughput. :)

Right now we're trying to decide on a reasonable sieve range and finishing up the web submission page. I'm thinking n = 3 - 5 million. There is still time to change the "official" sieve file but don't get too hung up if I decide to limit the n range. By the nature of the sieve being public, you don't have to use my "recommended" range if you don't want to. If you sieve a larger range, that will work too. Basically, do whatever you want to do to find the factors that you want to find. There's nothing even saying that you need to use the program Paul made for us if you work for the NSA and know a better way to factor these numbers. ;) That's the beautiful part... factors can be obtained however you choose to obtain them. It's incredibly trivial to verify them. In fact, I have finished a factor verification backend already. It's like 20 lines of code. Seriously.

When Dave has time, he's gonna help me integrate it into a Mason script to parse down results. At that point there will be a new sieve page on the site. It will have instructions, a link to download the multi-k siever, and two submission forms. One will be for the regular siever results. It will be a multi-line text box with a submission button. When you see the siever, you'll understand how easy this will all be. The other submission page will be for old sieve results from NewPGen... it will have the multi-line text box and also a place to specify the k value the results are for. And for the beginning, I'll have a direct link to the forum so people can coordinate the sieving effort until I can make a range "reservation" signup and such.

I know you're all excited to get this stuff up and running but it will probably be a few more days yet. I'm excited too but I want to give things a chance to test out and solidify before public factor submission begins. And we still need to write most of the sieve page. ;)

When it's done, you'll all be the first to know. There will be an announcement on the homepage probably too.

-Louie

MAD-ness
01-09-2003, 01:45 AM
Very, very cool.

I anxiously await this being finalized so people can contribute to finding factors. Especially if it helps to increase the overall progress of the project.

Bellbox
01-09-2003, 02:50 AM
Sounds like a great new option.

ceselb
01-09-2003, 03:01 AM
Yes, very cool.
I'm volunteering as beta tester and/or pre-launch siever (to reduce download size) if there is any need.

Bewolf
01-09-2003, 03:20 AM
Excellent news Louie.

One question: I have been sieving k=4847 for n=3M to 20M for a couple of days. Will this submission page take n values that high (if I read correctly, it will), and will it be able to handle a 1.5MB .del file produced by NewPGen (or whatever equivalent form your script will accept)?

ceselb
01-09-2003, 06:01 AM
Before I forget: Did all 12 k's from before up to 25M. If anybody is interested it's here (http://www.visit.se/~ceselb/all_k_to_25M.zip) (zipped, 3.05MB).

philmoore
01-09-2003, 02:07 PM
That's great news from Louie about the new sieve! My only reservation is that a sieving range of n from 3 million to 5 million seems a little conservative. We ran through all the exponents from 1 million to 2 million in about a month. At that rate, allowing for the fact that testing an exponent takes time approximately proportional to n^2, testing the range from 3 million to 5 million should take just over a year. However, since we also are hoping to attract new participants to the project, we have to assume that it may take less than that. Consider that the time it takes to sieve is proportional to the square-root of the sieving range. Therefore increasing the range by half to 3 million to 6 million only requires 22% more sieving time, but will generate twice as much work for the project. (The square-root of 3/2 is about 1.22.) Increasing the range to 3 million to 10 million would require 87% more sieving time, (since the square-root of 7/2 is about 1.87) but generate 10 times the work for the project. Of course we also can make future sieves more efficient by eliminating more k-values, but even eliminating 2 or 3 more k-values won't buy us the efficiency that increasing the sieving range does.
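Phil's square-root argument is easy to check numerically. A minimal sketch, assuming (as the post does) that sieving time grows with the square root of the n-range width:

```python
import math

def relative_cost(base_width, new_width):
    """Sieving-time ratio when widening the n range, sqrt scaling assumed."""
    return math.sqrt(new_width / base_width)

base = 5_000_000 - 3_000_000   # the proposed 3M-5M range, width 2M

for upper in (6_000_000, 10_000_000):
    width = upper - 3_000_000
    extra = (relative_cost(base, width) - 1) * 100
    print(f"3M-{upper // 1_000_000}M: ~{extra:.0f}% more sieving time")
```

This reproduces the figures above: widening to 3M-6M costs about 22% more sieving time for twice the work, and 3M-10M about 87% more for five times the range width.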

MikeH
01-09-2003, 04:11 PM
This is really good news.

With a good semi-automated sieving process we might finally start sieving to a good depth. As long as the sieve is pulling out numbers at a greater rate than that at which we would expect the SB client to process numbers in the same n range, all that shows is that we're not sieving deep enough. The optimal point is where the sieve and expected SB rates are equal.

This means that if you're running the sieve on a P4-1.7 and sieving for n around 3,000,000, and you're eliminating more than one number per day, then you're not sieving deep enough.

And from some of the posts I've seen on this thread, it would appear that we're not sieving deep enough.

Or maybe I'm missing something……

smh
01-09-2003, 04:28 PM
Originally posted by MikeH
The optimal point should be where sieve and expected SB rates are equal.


True, but don't forget that we are looking for primes, not for composites! So testing can start as soon as a reasonable sieving depth is reached. You never know: N=3000123 might produce a prime, and it would be a waste to sieve the whole range for weeks first.


Originally posted by MikeH
And from some of the posts I've seen on this thread, it would appear that we're not sieving deep enough.


No, not deep enough, but it takes a lot of CPU time to sieve deep enough. K=4847 has had a couple of 1GHz-weeks of sieving. But coordinating this by hand is really hard.

I agree with Phil Moore that 3-5M is a bit too small a range for sieving; 7 to 10M seems more reasonable. We don't have to sieve the whole range extremely far before starting to test 3M numbers. The sieving can continue in the meantime.

Louie, will you provide a way to reserve a range to sieve, or at least publish on the site which ranges have been sieved, to avoid duplication of work?

MikeH
01-09-2003, 05:31 PM
I also agree, opening the ranges a bit wider than 3 - 5M seems like a good idea.

I've been following this thread for a while, and since I have about 5 additional PCs which are on 24/7 but not able to get net access, I have thought about joining the sieving effort, but this "paper" co-ordination hasn't looked too good to me. So once Louie gets the web form and web co-ordination sorted, those PCs will be on the case.

I'm hopeful that there are other members of the project in similar circumstances. For sure we will then achieve deeper sieving, which can only be a good thing.

(But can we please make sure that a reserved range doesn't expire in 5 days!)

Nuri
01-09-2003, 06:00 PM
Originally posted by MikeH

(But can we please make sure that a reserved range doesn't expire in 5 days!)

I'm sure it won't. ;)

I guess you're right. The new sieving function might also attract people who knew about the project but didn't participate at full throttle (if at all) because of the net connection problem.

jjjjL
01-09-2003, 06:11 PM
I think that 3 - 20 million makes more sense than 3 - 5 million since there is a small hit in speed by extending the range. If there's a chance we will test a range, we should start sieving it now basically. I'll start building the official sieve file to this size.

For now, the forum will be the coordination center to reserve ranges. There will not really be an expiration in this system.

Hopefully there can be a reservation system on the website and stats on which ranges have factors submitted already (and how many) but that won't be available right away.

I'd also like to point out that there may come a time when double checks of Proth tests become important. In other words, you shouldn't simply aim to make the sieving rate equal the SB testing rate. A factor for a number is far more valuable than a Proth test for the same number, since a factor can be verified so easily. So I would say that it's more correct to sieve until finding a factor is at least 2x slower than just running SB. GIMPS uses the same methodology for the same reasons.
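The rule of thumb above can be sketched in a few lines. This is only an illustration: the 2x weight and the example timings are assumptions for the sake of the sketch, not project measurements.

```python
def keep_sieving(secs_per_factor, secs_per_proth_test, double_check_weight=2.0):
    """True while sieving remains the better use of CPU time.

    A factor is weighted at roughly two Proth tests, since it also
    spares a future double-check of the same number.
    """
    return secs_per_factor < double_check_weight * secs_per_proth_test

# e.g. one factor per day of sieving vs a hypothetical 14-hour Proth test:
print(keep_sieving(86400, 50400))   # still worth sieving
```

In other words, sieving stays worthwhile until pulling out one more candidate costs about twice as long as simply testing it.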

That, combined with the fact that the new siever is faster than before, means that sieving will probably never be done to an optimal level no matter how much CPU is thrown at it. THAT'S OK though. As smh reminds us above, we are looking for primes, not just the composites. A good % of SB's power should still go to the process that will actually find the primes, since sieving a k deeper makes little sense once there's a prime. Also, you have to realize that sub-optimal sieve levels are only slightly sub-optimal. They don't rob the project of more than a few % of its total throughput.

I am making the sieving public mainly for 2 reasons:

1) the project will no longer be dependent on my access to the U of M supercomputing clusters to sieve work, and is not dependent on me manually creating new workunits.

2) by storing all the factors in the database, it will be an excellent consistency check to ensure that we didn't somehow miss any numbers. By having the system more tightly coupled, it avoids possible human error on my part.

Minor reasons for making the sieve public include:

1) More processing power being put into sieving, bringing sieve levels closer to optimal.

2) there is the possibility that people will be able to factor the numbers in inventive new ways to seriously reduce the test space

3) the time I spend manually removing factors people send me can go to something else. It's not a big deal every now and then, but when I get a few a day, it's kinda tedious and it's definitely unnecessary.

OK. I'm writing a book on this. I'm done. Hope that helps answer some questions on how this will all work.

Summary: get ready to do lots of sieving. :)

-Louie

MAD-ness
01-10-2003, 04:01 PM
Since I lost my DSL connection, actually participating in the project with my machines is not very feasible for me.

I am another who would be able to contribute to the project by sieving who would not be able to contribute otherwise.

I would imagine that by the time all is said and done, the resources devoted to sieving will not take much away from the search for primes. Between machines that could not previously be used being put to work sieving, and increased interest in the project as a result of the public sieving effort, we should break even or even gain computers in searching for primes. Only IMHO, of course.

ceselb
01-15-2003, 11:30 AM
1500-1600G done, 5 numbers removed.
Starting on 1340-1400G.

smh
01-15-2003, 12:19 PM
I just saw you sent me the removed numbers.

Could you send them directly to Louie? There's no need for me to sit in between anymore, and I'm not able to send the file until this weekend.

ceselb
01-15-2003, 01:55 PM
ok, sent to Louie. :cool:

tom ehlert
01-16-2003, 03:59 PM
Just a question -

as I understand it, your new siever is able to sieve *all* Sierpinski numbers left to check (12 currently) at the same time, with a decent speedup compared to sieving each separately.

there is a project (Riesel primes), similar to SB, searching for primes of the form k*2^n-1 (Sierpinski is +1), where some 113 numbers are left.

would it be possible to change the 'SB multi-k sieve' to check these numbers as well? (shouldn't be a problem IMO)

this might accelerate the other (Riesel) project quite a bit

tom

paul.jobling
01-17-2003, 04:59 AM
Originally posted by tom ehlert
Just a question -

as I understand it, your new siever is able to sieve *all* Sierpinski numbers left to check (12 currently) at the same time, with a decent speedup compared to sieving each separately.


That is correct.

Originally posted by tom ehlert
there is a project (Riesel primes), similar to SB, searching for primes of the form k*2^n-1 (Sierpinski is +1), where some 113 numbers are left.

would it be possible to change the 'SB multi-k sieve' to check these numbers as well? (shouldn't be a problem IMO)

this might accelerate the other (Riesel) project quite a bit


Indeed, it is very possible - in fact, just last night I created a version of the software to do exactly that, and sent it to Wilfrid Keller and Ray Ballinger. The only changes required - apart from cosmetic issues - were removing one minus sign and removing the characters "p-" in two places.

Regards,

Paul.

tom ehlert
01-17-2003, 06:18 AM
> in fact, just last night I created a version of the software to do exactly that, and sent it to Wilfrid Keller and Ray Ballinger

So I was a little late in suggesting it ;)
Thanks

Bommer
01-17-2003, 01:17 PM
Where can I download the software?

Cu

MikeH
01-17-2003, 04:20 PM
Louie wrote
I know you're all excited to get this stuff up and running but it will probably be a few more days yet. I'm excited too but I want to give things a chance to test out and solidify before public factor submission begins. And we still need to write most of the sieve page.
From Paul's recent postings, it would appear that things have moved forward. We now look forward to a "SoBSieve".

Any update on how things are going, and as to when we meagre mortals (the public) can start sieving?

Nuri
01-19-2003, 04:44 AM
SoBSieve started.

For those who haven't noticed it yet, find it at Louie's thread here (http://www.free-dc.org/forum/showthread.php?s=&threadid=2370)

Regards,

Orkun

jreyesiii
01-19-2003, 02:04 PM
Does "SoBSieve" support SMP? (The other thread is for coordinating the reservation of p (sieve) ranges.)

ceselb
01-19-2003, 02:10 PM
Not natively, but I imagine you can start 2 copies and set the affinity by hand.

Perhaps it's better that you ask in the other thread.