Sieve Coordination Thread
Hi guys. The public sieve is... well... public. :)
http://www.seventeenorbust.com/sieve/
Most of the info you need to get started is there.
The community should start at p = 25 billion. I'm gonna finish the range below that myself.
I'd recommend 25 billion wide p ranges since the n range is so wide (n = 3 - 20 million).
i. e.
first person: 25 - 50 billion
next person: 50 - 75 billion
next person: 75 - 100 billion
etc...
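The hand-out scheme above can be expressed as a tiny helper, purely for illustration (the function name and defaults are mine, not anything on the SB site):

```python
# Illustrative: compute the n-th 25-billion-wide p range, starting at p = 25 billion.
WIDTH = 25_000_000_000

def nth_range(n, start=25_000_000_000, width=WIDTH):
    """Return the (lo, hi) p range for the n-th volunteer (n = 0, 1, 2, ...)."""
    lo = start + n * width
    return lo, lo + width

print(nth_range(0))  # -> (25000000000, 50000000000)
print(nth_range(1))  # -> (50000000000, 75000000000)
```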
Don't let what you experience during the k=4847 sieving deceive you... sieving 17 million n-wide takes a lot longer than 1 million.
That said, it won't be 17 times slower. Paul Jobling managed to make SoBSieve several times faster than NewPGen was. This implementation is even faster than the private 2.71 beta that a few of you have.
If 25-billion-wide ranges don't make sense for you, change the width, but don't take a low p range unless you're SURE you can do it. It is very important that we get to a decent sieve level before the server starts assigning n-values above n=3 million.
If you are using a slow computer or are new and unsure if you want to do sieving, please reserve a range above 1 trillion for now.
In the future, I hope to make something on the sieve page to reserve ranges without using this forum but for now please coordinate using this thread exclusively.
One last reminder: Make sure you're logged into the SB site if you want to have it remember that you submitted the results (for future stats purposes). The sieve submission is really simple. That site again is:
http://www.seventeenorbust.com/sieve/
There will be an announcement on the site about it tomorrow. Email me if you run into any problems.
Happy sieving folks. This should be interesting! :)
-Louie
Behaviour of alpha, and speed as a function of p
The cost of testing a prime is almost certainly
time(generating the next prime)
+#bigsteps * time(bigstep)
+#babysteps * time(babystep) * #ks
The first term, generating the next prime p to test, should be negligible. I'm sure Paul's sieve uses an n·log log n algorithm, so it will slow down slightly as the p range increases, but there will be fewer primes to test, so the total time to do a fixed-width p range should remain roughly flat. (In my own sieves it tends to decrease.)
The big steps are modular multiplications and writing entries into some kind of hashing structure. These are quite slow.
The baby steps are either simple halvings or doublings, and lookups into the previously created hash table. These are very fast, assuming you can stay within the cache. However, you can't! Nonetheless they're still quite quick.
Both of the above depend on the exact machine architecture being used. However, one that's faster for the big steps will almost certainly also be faster for the baby steps too. The exact ratio may vary though.
To perform a discrete log (which is what fixed-k sieving is), the product of the number of big steps and baby steps must exceed the n range.
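To make the big-step/baby-step structure concrete, here is a minimal baby-step giant-step discrete log sketch in Python. This is not the sieve's actual implementation (which is far more optimized and handles many ks at once); the function name and the toy example are mine. Note how the number of baby steps times the number of big steps must cover the whole exponent range:

```python
# Minimal baby-step giant-step sketch: solve b^x = target (mod p)
# for x in [0, n_range). Illustrative only, not SoBSieve's code.
from math import isqrt

def bsgs(b, target, p, n_range):
    m = isqrt(n_range) + 1        # baby steps; big steps ~= n_range / m
    # Baby steps: table of b^j mod p for j = 0..m-1 (the "hashing structure")
    table = {}
    e = 1
    for j in range(m):
        table.setdefault(e, j)
        e = e * b % p
    # Big-step multiplier: b^(-m) mod p (3-arg pow with -1 exponent needs Python 3.8+)
    factor = pow(b, -m, p)
    gamma = target
    for i in range(m):            # big steps: one modular multiplication each
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * factor % p
    return None                   # no solution in [0, n_range)

# Toy example: 2^x = 9 (mod 23); since 2^5 = 32 = 9 (mod 23), x = 5.
print(bsgs(2, 9, 23, 22))  # -> 5
```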
So all in all the cost is
C0 + C1*B + C2/B
where B is the number of big steps, and C0, C1, C2 are constants. (The C2/B term comes from the baby steps, since #babysteps is roughly the n range divided by B.)
Now while this appears to shoot off to infinity, it does have quite a broad area where it will be quite flat.
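To see why the curve is flat near the minimum, here's a quick sketch of the cost model with made-up constants (C0, C1, C2 below are illustrative, not measured from any sieve). Setting the derivative C1 - C2/B^2 to zero gives the optimal B = sqrt(C2/C1), and costs near that point barely move:

```python
import math

# Illustrative cost model: total = C0 + C1*B + C2/B, B = number of big steps.
# The constants are invented for demonstration, not taken from SoBSieve.
C0, C1, C2 = 10.0, 0.5, 800.0

def cost(B):
    return C0 + C1 * B + C2 / B

# dC/dB = C1 - C2/B^2 = 0  ->  B_opt = sqrt(C2/C1); cost there is C0 + 2*sqrt(C1*C2)
B_opt = math.sqrt(C2 / C1)
for scale in (0.5, 0.75, 1.0, 1.5, 2.0):
    B = B_opt * scale
    print(f"B = {B:6.1f}  cost = {cost(B):6.2f}")
```

Even at half or double the optimal B, the cost only rises by about 20%, which matches the broad plateau in the timings below.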
For example, some times for my own particular discrete log with various alphas:
alpha  time
0.6    1m54.690s
0.8    1m51.680s
1.0    1m49.290s
1.2    1m49.300s
1.4    1m48.690s
1.6    1m49.040s
1.8    1m50.010s
2.0    1m50.680s
2.2    1m51.660s
2.4    1m51.730s
2.6    1m54.270s
2.8    1m55.940s
3.0    1m56.720s
3.2    1m58.240s
3.4    1m59.830s
Now while it might appear obvious that 1.4-1.5 would be best from the above run, times of _identical_ tests across multiple runs can vary by 2% or more, much more than the half-percent separating the times above. I get the feeling the timing is more sensitive to the state of the cache than it is to the alpha parameter. (The jump between 2.4's 1:51 and 2.6's 1:54 is, I'm sure, an example of this variation.)
So, in summary - don't worry too much about the exact value of alpha; as long as you're reasonably close to the theoretically perfect value, your rate won't vary hugely.
Phil