Definitely much faster. :)
Louie et al,Quote:
Originally posted by jjjjL
hi Robert. welcome to the discussion. :)
i'm still digesting what you've suggested. i'm sure paul will have something to say about it when he reads it. anyone else think he's onto something?
mklasson - have you done this with your siever? is the math good and does it yield an improvement?
-Louie
I am on a well-deserved holiday travelling around Spain right now, and my only chance to hit the internet is today while we are in Granada and Karen is "shopping for shoes" :)
I saw Robert's ideas here and also in an email he sent to me... I haven't had time to look at them in detail (in an internet cafe, time really is money) but I will definitely look closely at them when I get home in a week's time.
BTW, are there any programmers from Spain here? How do you cope with the square brackets and vertical bar (|) being so difficult to find?
Regards,
Paul.
Hi Paul, you can get the |, [ and ] with the Alt GR key
Alt GR + 1 = |
And the two keys at the left of the key P with the ALT gr give you the brackets.
It isn't difficult if you have worked with a Spanish keyboard all of your life ;)
Enjoy your holidays!
That sounds like the normal US layout, doesn't it?Quote:
And the two keys at the left of the key P with the ALT gr give you the brackets.
Btw. if you come to Germany, [ and ] are Alt Gr + 8 resp. 9, | is Alt Gr + <>-key. ;)
Much, much faster!Quote:
Originally posted by jjjjL
i optimized the factor database today. factor submission should be noticeably faster now. let me know.
-Louie
Anywhere from 40x to 100x faster!
That's 4000% to 10000% faster for those of you who like %.
When a k is found/completed, whatever... are the .dat files quickly changed to exclude the k that is no longer needed, and thus keep us from spending time finding factors for that k?
In other words... is it such a good idea to reserve a month or so of work when a k could be found at any time? Or is it not that much more work for the client to scan 12 instead of 11 k's? Would we get like an 8% boost if we just did a day or two's work at one time and then changed the .dat when a k is found? I know a lot of people may point out that it's a long time in between finding k's... but I'm an optimist:)
Talk real slow on your explanation... I run cat5e for a living... not program in asm, and I don't think I've ever read books that start out... "The World Around You: Explained in Number Theory"
That is no problem. All you have to do is change the leading 12 to 11 and erase the list of n's assigned to the found k. As these are sorted by k, this is done within a minute. If you find a prime, I'll show you. :jester:
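For the curious, here is a minimal sketch of that trim. The SoB.dat layout is assumed here (first line holding the count of k values, each k's section opening with a "k=" line); the real format may differ in its details.

```python
# Hedged sketch of trimming a SoB.dat-style file once a k is eliminated.
# Assumed (simplified) format: the first line is the count of k values
# ("12"), and each k's section starts with a line "k=<value>".

def trim_dat(in_path, out_path, found_k):
    with open(in_path) as f:
        lines = f.read().splitlines()
    out = [str(int(lines[0]) - 1)]      # the leading 12 becomes 11
    skipping = False
    for line in lines[1:]:
        if line.startswith("k="):
            skipping = (int(line[2:]) == found_k)
        if not skipping:                # drop the n list of the found k
            out.append(line)
    with open(out_path, "w") as f:
        f.write("\n".join(out) + "\n")

# Hypothetical usage once 21181 (say) turns up prime:
# trim_dat("SoB.dat", "SoB_new.dat", 21181)
```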
Question: I'm sieving 32300 to 32350 and one of my factors found is for 21181*2^3392204+1. Now this number is less than 4 mil... has this number already been previously sieved and probably already prp'd? Will numbers less than 4 million just be thrown away by the script... or is this a new find altogether?
If it's the first known factor for a k,n pair, it can always be used to eliminate the need for a double-check PRP.Quote:
Originally posted by b2uc
Question: I'm sieving 32300 to 32350 and one of my factors found is for 21181*2^3392204+1. Now this number is less than 4 mil... has this number already been previously sieved and probably already prp'd? Will numbers less than 4 million just be thrown away by the script... or is this a new find altogether?
Mikael
Would it be possible, just for curiosity's sake, to sieve just for one k value by deleting all of the data for the other values?
yes, that's exactly what will be done when we find the primes.
Hi,
Attached is a 1.34 release of SoBSieve. This is almost identical to the 1.33 release - it does a couple of things in a different order to help the Sieving Server development effort, but apart from that the performance is the same as before.
Regards,
Paul.
Thanks a lot, paul! :cheers:
But one question: We already had version 1.34, didn't we?
Did we? I thought the last release was 1.33 - though I might have put a 1.34 up for a while then withdrawn it...Quote:
Originally posted by Mystwalker
Thanks a lot, paul! :cheers:
But one question: We already had version 1.34, didn't we?
Anyhow, if it doesn't cause a problem I won't rerelease. BTW, does it work as required now? I look forward to distributed sieving.
Regards,
Paul.
I started sieving 3 weeks ago and I've only used the 1.34 version. I downloaded it from MikeH's "All Things SoB Sieving" website:Quote:
Originally posted by paul.jobling
Did we? I thought the last release was 1.33 - though I might have put a 1.34 up for a while then withdrawn it...
Anyhow, if it doesn't cause a problem I won't rerelease. BTW, does it work as required now? I look forward to distributed sieving.
Regards,
Paul.
http://www.aooq73.dsl.pipex.com/
It was the recommended client to use for dual range sieving. Is it the same??? or do I need to download the rerelease???
John
John,
Your 1.34 is OK. Paul released and pulled a 1.35
Paul,
Welcome back from vacation/holiday.
Paul, I do hope that you based this new 1.34 on the old 1.34 not on the 1.33.Quote:
Originally posted by paul.jobling
Hi all,
I found the problem that was causing the extra factors to be found. It has been fixed in the attached release. I strongly recommend that you upgrade to this version!
Regards,
Paul.
Paul, could you enlighten us about this?
Ok guys,
It looks like I forgot about a 1.34 release somewhere along the way. This is the latest code, though, so there shouldn't be any problems with it. Sorry about that! I can make a release with a 1.35 label if you are desperate for it?
Paul.
It would make the requirements list easier if there's only one new client with that version. ;)Quote:
I can make a release with a 1.35 label if you are desperate for it?
Honestly speaking, I haven't tested it myself. But my fellow student didn't comment, so I think it is working. Testing today was quite good. :thumbs:Quote:
BTW, does it work as required now? I look forward to distributed sieving.
Other question:
supersecret is approaching n=300k. When do we plan to raise the lower limit of DC and dual sieving?
OK, here is one branded 1.35 for y'all.
How many n values that are in the current dat file have already been removed? It appears to me that this should be a rather large number, since this dat includes values that were between 3-4 million. I understand cutting the range has been talked about before, but there are also a lot of values that have already been removed by sieving, and these values don't need a double check, so why do we still check them with the sieve? Could we speed up the siever with a new updated dat file?
If someone with a little bit more know-how than me could offer some input on this I would really like to hear it. Perhaps by using an automatically updating dat file we could build a siever that would constantly increase in speed as the project continues. This would allow the siever to achieve much greater speeds as time progresses, perhaps greatly extending the life of productive sieving.
keroberts1:
the siever basically has to process every n value between the lowest you want it to find and the highest, i.e. between 3M and 20M currently. It doesn't matter how many k,n pairs there actually are in that range. This isn't _really_ true, but it's more so than not. :) If you get down to very few remaining pairs, there's a chance they all fall in the same residue classes modulo larger values than what's currently used.
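To make the residue-class remark concrete, here is a small sketch that tallies which classes mod D the remaining n values occupy for each k; the (k, n) pairs below are made-up stand-ins for what would be read from SoB.dat. If every pair for a k falls into only a few classes mod some larger D, a sieve could in principle step more coarsely.

```python
# Tally the residue classes mod D occupied by the remaining n values
# for each k. The (k, n) pairs below are illustrative only.
from collections import defaultdict

def residue_classes(pairs, D=48):
    classes = defaultdict(set)
    for k, n in pairs:
        classes[k].add(n % D)
    return classes

pairs = [(21181, 3392204), (21181, 9694100), (55459, 5984698)]
for k, residues in sorted(residue_classes(pairs).items()):
    print(k, sorted(residues))
```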
the idea that removing sieved out numbers would increase speed has been posted here before. paul has said that it doesn't help. phil confirmed that paul's conclusion was correct. they wrote the sieve so they should know.
some have claimed slight speed increases by clearing extra values out of the dat file. if someone convinced me that there were a true benefit to having an updated dat file, there would be an automated system to make new ones but i still don't believe it.
-Louie
Well, SoBSieve might use a little less memory with fewer numbers to keep track of. The resulting reduction in swap use could make for a slight speed increase.
Also, if you update the SoB.dat the p-1 factorer package (and updates) could be made smaller.
Hmm... I just came across some sort of oddity. A few moments ago I found the factor
32560064474489 | 55459*2^7234354+1
with my own siever. I reported it at .../sieve and was told it was a new factor. Then I ran sobsieve on the same range and to my surprise it didn't find that factor. It turns out that 55459*2^7234354+1 isn't included in the sob.dat I was using (my program in its infinite wisdom didn't bother to check there... :) ).
How come? My factor was reported as new and there's no factor for that k,n pair in any of lowresults.txt and results.txt, so why wasn't 55459,7234354 in the sob.dat (dated Jan 30)?
EDIT: it just occurred to me that it might be because some p < 1G is a factor for that k,n... Is this the case?
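Checking a reported factor like this is cheap with modular exponentiation: p divides k*2^n+1 exactly when k*2^n ≡ -1 (mod p). A quick sketch (assuming the report above is correct, the assertion passes):

```python
p = 32560064474489
k, n = 55459, 7234354
# p | k*2^n+1  <=>  k*2^n ≡ -1 (mod p)
assert (k * pow(2, n, p)) % p == p - 1, "not a factor"
print(f"{p} divides {k}*2^{n}+1")
```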
If so, it will appear here tomorrow.Quote:
Originally posted by mklasson
EDIT: it just occurred to me that it might be because some p < 1G is a factor for that k,n... Is this the case?
Well I'm one of those that reported a slight increase in performance, BUT it took a 22% decrease in the size of the file to get a 2% increase in performance.Quote:
Originally posted by jjjjL
some have claimed slight speed increases by clearing extra values out of the dat file.
-Louie
I had neglected to change the sob.dat file on one of my lower n range machines. I was still using the original file. I replaced it with an updated file that was the last one before the change to 300K as a lower bound just to try it.
That's the story, folks.
It hardly seems worth it for all the work Louie would have to do.
OK, well, thanks for getting back to me at least. Maybe once we have removed more than half of the range, we can trim it and get a significant increase in speed.
Has anyone managed to figure out and implement Robert's idea to speed up the sieve client yet? I'm not a math wizard, so I was unable to do so :(
Ah, well I read it, and the problem with implementing it - as with any "order of 2" idea - is that there is no fast way to find the order of 2 mod p. The only way it can be done is to completely factor p-1, and that is slow. A partial factorisation can be obtained (finding the prime divisors <100000, say), but the information that it provides on the order of 2 would be so weak as to make the test essentially useless.Quote:
Originally posted by Moo_the_cow
Has anyone managed to figure out and implement Robert's idea to speed up the sieve client yet? I'm not a math wizard, so I was unable to do so :(
However, all ideas are gratefully received, and Robert's contribution was welcome. Please keep any ideas you have coming!
Regards,
Paul.
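For the curious, this is what the order computation looks like once p-1 is fully factored. A hedged sketch only: the factorint call is exactly the complete factorisation Paul describes as the slow part.

```python
from sympy import factorint  # full factorisation of p-1: the slow part

def order_of_2(p):
    """Multiplicative order of 2 mod prime p, given p-1 fully factored."""
    order = p - 1
    for q, e in factorint(p - 1).items():
        for _ in range(e):
            if pow(2, order // q, p) == 1:
                order //= q       # 2 still has order dividing order/q
            else:
                break
    return order

print(order_of_2(1000003))  # toy example; sieve primes are ~2^45 and up
```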
Hi,
I have got a Linux version of SoBSieve... does somebody want to test it for me? If so, PM me with your email address.
Regards,
Paul.
Question: If I'm doing a week or so of sieving and I notice a factor around the 3.9 mil to 4.1 mil range... or if I'm not sure when exactly I'll finish my range and I see these low factors... should I just cut and paste the few of them to /sieve/ before the proth tests suddenly catch up to them? Like yesterday, out of the 30+ I sent in at the completion of a test, some were as low as 3.8M and it took them just fine, presumably as new factors.
Also.
Question 2: Can you alter the Linux client just a tad so it will run optimally on FreeBSD? If I can run it on FreeBSD, I can put forth a lot more effort. No Linux here.:whip:
After reading Paul Jobling's last-but-one post, I re-read the first post I made to this forum with the algorithm. Unfortunately, I left out a main point.
In the fifth step of the pseudocode I wrote only:
5. Do the general baby-step giant-step method.
It is a baby-step giant-step method, but not the original algorithm. We know for every remaining k value that there are only a few possibilities for n mod D (see the fourth step: in the algorithm D >= 48), and this fact reduces the required cost dramatically. Let c = Sum{over every remaining k value, the number of possibilities for n mod D}, and write n = nmin + a + b*arange. If arange is divisible by D, then since we know n mod D for every remaining k value, we also know a mod D.
If D is large, the cost of a baby step equals the cost of a giant step, so we can easily optimise the required cost. Let
baby step = D
arange = (isqrt(D*(nmax-nmin)/c) \ D)*D + D   (\ means integer division)
brange = isqrt(c*(nmax-nmin)/D) + 1
The total cost is then about 2*sqrt((nmax-nmin)*c/D) multiplications mod p in the fifth step of the pseudocode, which is small. In the original algorithm we would use, say, D = 48, because we know there are only a few possibilities for n mod 48 for every remaining k value; in our algorithm D >= 48, so this step is faster by about a factor of sqrt(D/48). (If we use D = 48, a baby step is a little faster than a giant step, because it is easy to multiply a number by 2^48 mod p.) The required cost in the previous steps won't grow much if we use only the primes up to, say, 1000 for the factorisation of p-1.
So we don't use the big trick only to discard k values: we can also reduce the required multiplications mod p in the fifth step of the pseudocode. And if we don't use many multiplications mod p in the previous steps, the sieving method as a whole will be faster.
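To make the fifth step concrete, here is a plain baby-step giant-step sketch for the underlying question: given p, find n in [nmin, nmax) with k*2^n ≡ -1 (mod p). Robert's refinement would restrict the baby steps to the few admissible classes of n mod D, shrinking the table by roughly a factor of D/c per k; this unrefined version is just the baseline it improves on, and the bounds in the example call are made up.

```python
from math import isqrt

def find_n(k, p, nmin, nmax):
    """Baseline BSGS: find n in [nmin, nmax) with k*2^n ≡ -1 (mod p)."""
    target = (-pow(k, -1, p)) % p            # want 2^n ≡ target (mod p)
    m = isqrt(nmax - nmin) + 1
    baby = {}                                # 2^(nmin+a) mod p  ->  a
    x = pow(2, nmin, p)
    for a in range(m):
        baby.setdefault(x, a)
        x = x * 2 % p
    shift = pow(pow(2, m, p), -1, p)         # one giant step: divide by 2^m
    for b in range(m + 1):
        if target in baby:
            return nmin + baby[target] + b * m
        target = target * shift % p
    return None                              # no such n in the range

# Illustrative call with made-up bounds; may well print None:
print(find_n(21181, 32560064474489, 3000000, 20000000))
```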
If the size of the .dat file doesn't matter for the sieving speed (or only minorly), then perhaps we should add the values for the 20 million+ n ranges. After all, we are progressing fast, and though we won't be getting there any time soon, it also appears that we won't be getting the last primes without getting there first. Maybe it'd be more efficient to include those values now so we don't have to repeat huge amounts of sieving.
That would have to be a completely new sieving effort, rather than being added to the current data - we want to check them for divisibility against the 'small' primes as well, after all.Quote:
Originally posted by Keroberts1
If the size of the .dat file doesn't matter for the sieving speed (or only minorly), then perhaps we should add the values for the 20 million+ n ranges.
Values up to 20 million will probably keep SoB busy enough for the foreseeable future :D
Regards,
Paul.
I've noticed that some of the factors that are removed from the dat by the siever are clumped together in some sort of a pattern. I was wondering if this could be used to in some way learn something new about these kinds of probabilities. Perhaps some distinct pattern could help us discover our own ways to speed up the algorithm. Just a thought; I'm not even sure if the factors are indeed clumping. I just noticed this in mine and thought it was unlikely to happen.
58130163354677 | 55459*2^5984698+1
58130502895411 | 33661*2^5974872+1
58130637894977 | 28433*2^14473633+1
58133832667853 | 21181*2^9694100+1
58139256462389 | 24737*2^12631087+1
58139814210671 | 24737*2^19963543+1
58143005430539 | 5359*2^13435030+1
58143177309443 | 55459*2^18929614+1
58143725709331 | 4847*2^16345647+1
58152700125841 | 22699*2^5266198+1
58153361051249 | 67607*2^18849011+1
Keroberts, I noticed that too - whenever a big gap occurs between factors, around 3-4 factors are clumped together after that. I thought that it was just a strange coincidence, though.
It wasn't just clusters following gaps; I believe there may be some consistency to the pattern of the sizes of the gaps. Perhaps the sieve client may be able to intelligently skip some ranges. This may not be a good idea at the moment, since we can still remove factors at a fast enough pace that it is very much worthwhile to keep sieving. However, once factor density drops drastically in the upper ranges, it may be better to risk missing some factors in exchange for the quick retrieval of a few. It's just speculation on my part; I don't know nearly enough about computers or number theory to try to implement any of this. It could be a real blessing for the project though.
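Out of curiosity, the clumping claim is easy to eyeball from the list posted above by printing the consecutive gaps between the p values. Nothing here proves or disproves a pattern; it just makes the spacing visible.

```python
# Gaps between consecutive p values from the factor list above.
ps = [58130163354677, 58130502895411, 58130637894977, 58133832667853,
      58139256462389, 58139814210671, 58143005430539, 58143177309443,
      58143725709331, 58152700125841, 58153361051249]
gaps = [b - a for a, b in zip(ps, ps[1:])]
for p, g in zip(ps[1:], gaps):
    print(f"{p}: gap {g:>14,}")
```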