
Thread: Sieve Client Thread

  1. #41
    Junior Member
    Join Date
    Jan 2003
    Location
    Wuerzburg, Germany
    Posts
    3
    Hello,
I use SoBSieve 1.12 and the screen output only goes to 781 factors, then it stops. In SoBStatus.dat all factors are stored, including the 782nd, 783rd and so on. Is this just a minor display bug? People might otherwise think they aren't finding any more factors.

    Sorry for my English, but it's late and I have to go to bed now!

    Martin

  2. #42
New NbeGon runs at ~64kp/sec on my 1GHz Athlon Thunderbird (d/alpha = 3.2). SoBSieve ran at ~20kp/s with the same alpha, and all I did was start NbeGon from where I stopped SoBSieve (actually, to make sure I don't miss anything, I start it from the last factor found instead of from where it says it stopped).

Also, this scared me a lot, so I want to make sure it doesn't get anyone else: if a blank line somehow gets into the submission form, it will ring up as an invalid factor. So if you get an invalid, check the beginning and end of the submitted factors line for extra commas. Mine started with a comma, so that's how I knew a blank line had slipped into the front.

Also, NbeGon 010 uses 27k of memory for me, so I don't know what's going on with yours.

  3. #43
    Moderator ceselb's Avatar
    Join Date
    Jun 2002
    Location
    Linkoping, Sweden
    Posts
    224
    Originally posted by RangerX
    New NbeGon runs at ~64kp/sec on my 1gHz Athlon Thunderbird (d/alpha=3.2).
    Maybe you've already done this, but try a higher d.

    Originally posted by frmky
    On my Athlon XP 1800+, the optimal d parameter is -d=4.5

  4. #44
    Originally posted by frmky
    On my Athlon XP 1800+, the optimal d parameter is -d=4.5, and there it checks 97000 p/s. I'm sure someone with a faster Athlon will report your 100,000 p/s number.

    Greg
    I had an interesting brainwave today - I'll see if I can push you over the limit. I'm hoping I can push Louie's 88000 over too. 12% shouldn't be too hard (famous last words). This idea also splits the code into smaller, more manageable units (do one job at a time, and do it well).
    Actually I had several brainwaves - one of which was to use the -march=athlon switch to gcc. That made it 20% slower!

Sorry about the P4s - they'll almost certainly need assembly. David Underbakke, the coder behind the sieve for Yves Gallot's GFN project, has some very fast 64-bit modular maths libraries for the P4, and they'll possibly be the cure.

    Phil

  5. #45
    Originally posted by ceselb
    Maybe you've already done this, but try a higher d.

When I ran SoBSieve (where I could see a visible rate) I played around with the values. 3.1 gave me somewhere around the 17000s, same with 3.3; 3.2 gave me 19000s. I just assumed that the equation only had one peak and that was it. Is it possible that there are multiple peaks, so that higher alphas might yield something better?

Also, Phil: is it advisable to restart NbeGon every once in a while to reset the overflow numbers? Right now mine's at 13576|3|1|0.

  6. #46
    Sieve it, baby!
    Join Date
    Nov 2002
    Location
    Potsdam, Germany
    Posts
    959
As far as I understand it, an overflow only has an impact on the very prime it tries, so the count is basically irrelevant for the speed and is only useful for tweaking. Phil should know best, though.

Otherwise, there's no downside to stopping the client. If you want, you can copy the value of the last p and insert it into the batch file, to also recover the seconds to few minutes of work since the last checkpoint.

  7. #47
    Member
    Join Date
    Dec 2002
    Location
    new york
    Posts
    76
3.2 was indeed a good alpha for SoBSieve, but it's on the low side for NbeGon. I just did a bunch of tests and found that the whole range of d=3-6 is pretty similar, but d=4-5 is the sweet spot. This is on an Athlon, and it's independent of the size of p.

    Another thing: sieving goes faster as p gets bigger (!)

  8. #48
    I don't know if it will improve people's speeds much, but a new sieve file is online bundled with the newest SoBSieve (v1.12).

    It has all numbers currently submitted removed... so all p < 25 billion (my initial range) should be gone along with anything else people had submitted up to about 10 minutes ago.

    The link is updated on the sieve page but here's a direct link to the new stuff:

    http://www-personal.engin.umich.edu/~lhelm/sieve112.zip

    This takes the sieve file from ~815000 n values to 746961 values. This is a linear reduction of 8.3%. However, I believe the speed of the sieve is related to the sqrt(n) so that reduction is only:

    903 --> 864

Giving a reduction of around 4.3%. I haven't done any benchmarking, so let me know if you get that theoretical ~4% speedup from using the new sieve file.
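As a quick arithmetic check of those figures (a stand-alone C sketch, not project code - and note that whether the sqrt model applies at all is questioned in the next post):

#include <stdio.h>
#include <math.h>

int main(void) {
    double before = 815000.0, after = 746961.0;  /* candidate counts */
    printf("linear:  %.1f%%\n", 100.0 * (1.0 - after / before));            /* 8.3% */
    printf("sqrt:    %.0f --> %.0f\n", sqrt(before), sqrt(after));          /* 903 --> 864 */
    printf("speedup: %.1f%%\n", 100.0 * (1.0 - sqrt(after) / sqrt(before))); /* 4.3% */
    return 0;
}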

Happy crunching!

    -Louie

  9. #49
    Originally posted by jjjjL
    This takes the sieve file from ~815000 n values to 746961 values. This is a linear reduction of 8.3%. However, I believe the speed of the sieve is related to the sqrt(n)
    ...
    The n in that is the whole range, not the number of candidates remaining. So it won't speed up anything apart from the file load part of the restart times. The only algorithm that depends on the number of candidates is a linear one, and therefore very expensive for these large ranges.

    Round up of other questions:
    Oh - 010 did in theory have the 'G' abbreviation for 000000000 in the p range. I'm not sure I tested it though!

    Regarding speeding up as p increases, that's a feature of there being fewer p to test. The density is ~1/log(p). So a chunk near 10^11 would have 10/11-ths of the number of primes in the same sized chunk near 10^10.
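A quick way to see the 10/11: by the prime number theorem the density of primes near p is ~1/ln(p), and ln(10^10)/ln(10^11) = 10/11 exactly. A small stand-alone C sketch (my illustration, not Phil's code):

#include <stdio.h>
#include <math.h>

int main(void) {
    double W = 1e9;                  /* width of the chunk of p's */
    double near10 = W / log(1e10);   /* ~ primes in a chunk near 10^10 */
    double near11 = W / log(1e11);   /* ~ primes in a chunk near 10^11 */
    printf("near 10^10: ~%.1fM primes\n", near10 / 1e6);
    printf("near 10^11: ~%.1fM primes\n", near11 / 1e6);
    printf("ratio %.4f = 10/11\n", near11 / near10);
    return 0;
}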

Overflow printouts - I was pretty sure that no prime would ever overflow the full 6 times, and therefore that I had nothing to worry about. The maths proves that assumption false, though: in theory there will be a few annoying primes that slip through.
I might just shut the printouts up until there's been a 5th overflow. Given that people have faster machines than I do, I will be slowing down the print statements in the next version.

    Unpredictable dependency of speed on the dimension parameter - that's confusing. The working set for the DLOG is actually the same size whatever the dimensions are, as I use a hash-table, and I assume that the whole thing is fighting for the cache, and so cache-abuse should be the same whatever the dimensions are. My next tweak will hopefully decrease cache-abuse by decreasing the working set size. Maybe that will make the timings smoother. Fighting cache can be an unpredictable business, particularly in these scatter-write/gather-read scenarios. I think the P4 has a small L1 data cache, doesn't it? If so, hopefully P4s have more chance of seeing an improvement from this next change.

    Phil

  10. #50
    Senior Member
    Join Date
    Jan 2003
    Location
    U.S
    Posts
    123
Does anyone know the best alpha value to use for SoBSieve v1.12? I think it's 1 or 3.2, but it may change between versions.

  11. #51
    ahh, i had a feeling the speed dependence might be on the range of n and not the # of candidates. in that case, the sieve file will probably not be updated too often.

    -Louie
    Last edited by jjjjL; 01-26-2003 at 04:58 AM.

  12. #52
    Moderator ceselb's Avatar
    Join Date
    Jun 2002
    Location
    Linkoping, Sweden
    Posts
    224
    Originally posted by jjjjL
    the sieve file will probably not be updated too often.
    Well, the sieve should take a little less RAM now and that's a good thing imho. Perhaps another update when 1T is mostly done?

  13. #53
    Sieve it, baby!
    Join Date
    Nov 2002
    Location
    Potsdam, Germany
    Posts
    959
    Originally posted by FatPhil
Overflow printouts - I was pretty sure that no prime would ever overflow the full 6 times, and therefore that I had nothing to worry about. The maths proves that assumption false, though: in theory there will be a few annoying primes that slip through.
Well, I actually had one of those yesterday or the day before...
I guess it was the 009 version with a low alpha value (0.8, which was the best setting I found). With 010, hash overflows (at the speed-optimal d value) are much reduced.

  14. #54


    To Paul.

Are some of the performance gains also applicable to other calculations made with NewPGen, and will there be a new version in the months to come?

Keep up the good work.

Lars
www.Rechenkraft.net - the most comprehensive German website about distributed computing projects

  15. #55
What do you mean by overflowing all 6 times? Right now my client is showing 3653|74|2|1|1|1. Is any of that bad?

  16. #56
    Moderator ceselb's Avatar
    Join Date
    Jun 2002
    Location
    Linkoping, Sweden
    Posts
    224
Below is quoted from a mail from Phil (the bold part is my question).

Are the proportions of "hash overflows: 9388|154|1|1|0|0" in any way related to dimension (and performance)?

Yes. The more 'big steps' in the DLOG, the more likely my hash table is to overflow. If the big step stage is faster then I want to make it as large as possible. However, if I overflow the hash table I need to do that prime again with fewer big steps. Therefore an overflowed prime costs at least one and a half times a normal prime, time-wise. So the balance is making the first attempt small enough that it's as fast as possible, but large enough that it doesn't overflow too often. Overflowing 1 in 1000 primes is nothing. Multiple overflows choose progressively fatter rectangles with fewer big steps, and incur a further cost.

The balance is such a compromise that the dependency of speed on the dimension parameter is reduced. If you choose a parameter too small, then you get faster DLOGs, but you overflow more often, so you have to redo more often. Likewise the other way. So it's pretty flat performance-wise on the whole.
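For readers wondering what the 'big steps' and the hash table actually are: for each prime p the sieve has to solve k*2^n = -1 (mod p) for n, which is a discrete logarithm (the DLOG), and the classic tool for it is baby-step/giant-step. The sketch below is a generic textbook version in C (my illustration, not Phil's or Paul's actual code); it plants a known exponent and recovers it, whereas the sieve's target would be -1/k mod p for each k:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <math.h>

typedef unsigned __int128 u128;

static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t p) {
    return (uint64_t)((u128)a * b % p);
}

static uint64_t powmod(uint64_t b, uint64_t e, uint64_t p) {
    uint64_t r = 1;
    for (; e; e >>= 1) {
        if (e & 1) r = mulmod(r, b, p);
        b = mulmod(b, b, p);
    }
    return r;
}

/* Find n in [0,N) with 2^n == target (mod p); return -1 if none.
   Baby steps tabulate 2^j for j < m; giant steps stride by 2^(-m). */
static int64_t bsgs(uint64_t target, uint64_t N, uint64_t p) {
    uint64_t m = (uint64_t)ceil(sqrt((double)N));
    uint64_t *baby = malloc(m * sizeof *baby);
    uint64_t v = 1;
    for (uint64_t j = 0; j < m; ++j) { baby[j] = v; v = mulmod(v, 2, p); }
    uint64_t stride = powmod(powmod(2, m, p), p - 2, p); /* 2^(-m) via Fermat */
    uint64_t g = target;
    for (uint64_t i = 0; i * m < N; ++i) {
        for (uint64_t j = 0; j < m; ++j)      /* linear scan; real code hashes */
            if (baby[j] == g) { free(baby); return (int64_t)(i * m + j); }
        g = mulmod(g, stride, p);
    }
    free(baby);
    return -1;
}

int main(void) {
    uint64_t p = 1000000007;            /* a convenient known prime */
    uint64_t planted = 1234567;
    uint64_t target = powmod(2, planted, p);
    printf("recovered n = %lld (planted %llu)\n",
           (long long)bsgs(target, 2000000, p), (unsigned long long)planted);
    return 0;
}

In Phil's client the baby-step table is a fixed-size hash table, so an 'overflow' is a bucket filling up, and the prime has to be redone with a fatter rectangle (fewer big steps) - exactly the retry cost described above.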

  17. #57
    Member
    Join Date
    Sep 2002
    Location
    London
    Posts
    94
I guess that last 1 means that you have overflowed more than six times.

    Yours,

    Nuutti

  18. #58
    Originally posted by FatPhil
    I had an interesting brainwave today
Which didn't work. There's no point in me gnashing if I have no new ideas to chew on, so I've sent the 010 code, an explanation and a few musings to Paul. I hope that he can either provide some new insights for new revolutions, or at least implement my algorithm more efficiently in assembly for a simple, faster evolution.

    Oh, 'G' doesn't work! (due to a lack of a "++e;").

    Phil

  19. #59
    Originally posted by FatPhil
    The n in that is the whole range, not the number of candidates remaining. So it won't speed up anything apart from the file load part of the restart times.
    So there's no reason to update the sieve file?

    Originally posted by FatPhil
    The only algorithm that depends on the number of candidates is a linear one, and therefore very expensive for these large ranges.
So it's beneficial to update the sieve file?



    -Louie

  20. #60
    Originally posted by jjjjL
    So there's no reason to update the sieve file?
    Indeed.

The algorithm that depends on the number of candidates is the one we don't use! (It's effectively just glorified trial division.) Pretty useless for a situation like this; forget I mentioned it.
    (I'm not telling the whole story, but it's long, complicated, and wouldn't change the way we'd do things, so it's best left forgotten.)

    Sieve-file download time, and client startup time are pretty much the only things that are affected. And they're both itsy-bitsy.

    If you don't like duplicates (composites removed by two or more primes), then you can reduce the number of them by having people sieve the latest file, but I can see no reason to be afraid of duplicates.

    Psychologically, it might be good for the sievers to see the fruits of their (combined) labour by having them work on ever-decreasing sieve-files. I know I'd prefer to see the files shrink by a couple of thousand lines each time I started a new range. However, I've got a fast SDSL, and download times are irrelevant.

    Phil

  21. #61
    Sieve it, baby!
    Join Date
    Nov 2002
    Location
    Potsdam, Germany
    Posts
    959
    However, I've got a fast SDSL, and download times are irrelevant.
    Those who mind downloading a new sieve file every now and then don't have to...

    I think it's best to make an update from time to time - though it's not that critical.

  22. #62
    Senior Member
    Join Date
    Jan 2003
    Location
    U.S
    Posts
    123
    2 questions:

    1.) Have there been any duplicates?
    2.) If so, what is the chance of a factor being a duplicate?

  23. #63
    Sieve it, baby!
    Join Date
    Nov 2002
    Location
    Potsdam, Germany
    Posts
    959
Assuming equipartition (which I think holds), the chance that a factor is a duplicate should be <count of submitted factors> : 746961, as 746961 is the current number of candidates in the sieve file.

  24. #64
    Member
    Join Date
    Dec 2002
    Location
    new york
    Posts
    76
    With 12 k's and 17M n's, that's 200M candidates. How did we get down to 750K so quickly? Sieving?

  25. #65
    Originally posted by Moo_the_cow
    2 questions:

    1.) Have there been any duplicates?
    2.) If so, what is the chance of a factor being a duplicate?
1) Absolutely. While I was soak-testing 010 before release, I noticed that it output one factor that Paul's SoBSieve didn't. The maths told me I was right to remove the candidate, so why didn't Paul? Because the SoB.dat in the directory where his exe was was more recent than the one where mine was - that value had already been removed! And that was in the space of only a fraction of a billion.

2) On the way from 800000 to 750000 candidates, 50000 need to be removed. If you can select 50000 random numbers and have none of them identical, then you're cheating - many numbers will be removed multiple times. I'd be surprised if there weren't over 55000 submissions needed to remove 50000 numbers.

    If the central filter file is down to 700000 candidates, and you know that you are working on a file that has 800000 candidates, then your effective sieving rate is 7/8ths of what it looks like, as you can be sure about 1/8 of what you find will already have been found.
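Phil's 55000 guess can be sanity-checked with a standard occupancy estimate (my sketch, not Phil's; it assumes factors hit candidates uniformly at random, which understates the true count since removals aren't uniform): after m submissions over N candidates, about N(1 - e^(-m/N)) distinct candidates are removed, so removing D distinct ones takes roughly m = N*ln(N/(N-D)) submissions.

#include <stdio.h>
#include <math.h>

int main(void) {
    double N = 800000.0;  /* candidates in the file */
    double D = 50000.0;   /* distinct removals wanted */
    printf("~%.0f submissions for %.0f distinct removals\n",
           N * log(N / (N - D)), D);   /* ~51600 under the uniform model */
    return 0;
}

The uniform model gives ~51600; stale sieve files and non-uniform hitting push the real number higher, toward Phil's figure.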

    Phil

  26. #66
    Originally posted by dudlio
    With 12 k's and 17M n's, that's 200M candidates. How did we get down to 750K so quickly? Sieving?
    Yes!

These families are deliberately chosen with properties that mean they sieve down to almost nothing. Basically, vast proportions of the candidates have very small divisors, and simply by sieving with the first dozen or so primes you will throw out 96% of the candidates! (A more normal set of numbers would throw out about 80%, say.)

    Let's look at k=22699 (the most extreme)

    > for(i=1;i<=720;++i){printf("%d %d\n",i,factor(22699*2^i+1));}
    1 3
    2 7
    3 3
    4 5
    5 3
    6 11
    7 3
    8 5
    9 3
    10 17
    11 3
    12 5
    13 3
    14 7
    ...

    Just by sieving with 3,5,7,11,13,17 and 19 the only candidates that will remain will be those with n={70, 118, 142, 190, 214, 262, 334, 358, 430, 478, 502, 550, 574, 622, 694, 718} mod 720. i.e. 16/720 remain, or 2.22%!
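That's easy to verify by brute force - a small stand-alone C sketch (mine, not part of any sieve) that tests every n mod 720 against those seven primes:

#include <stdio.h>

int main(void) {
    const int primes[] = {3, 5, 7, 11, 13, 17, 19};
    int survivors = 0;
    for (int n = 0; n < 720; ++n) {
        int sieved = 0;
        for (int i = 0; i < 7 && !sieved; ++i) {
            int p = primes[i];
            long long v = 22699 % p;
            for (int b = 0; b < n; ++b) v = v * 2 % p;  /* 22699*2^n mod p */
            if ((v + 1) % p == 0) sieved = 1;           /* p | 22699*2^n+1 */
        }
        if (!sieved) { printf("%d ", n); ++survivors; }
    }
    printf("\n%d of 720 residues survive\n", survivors);
    return 0;
}

It should print exactly the 16 residues above.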

And this high density of composites is exactly the reason why they have such a low density of primes (to wit, currently none are known).

    Phil

  27. #67
    Sorry for the rather simplistic question, but basically we're sieving to remove easily-found composites so that we don't have to check them for primality (is that a word?) later?

  28. #68
    Member
    Join Date
    Dec 2002
    Location
    new york
    Posts
    76
    basically we're sieving to remove easily-found composites so that we don't have to check them for primality (is that a word?)

Yes, and yes. Sieving works in the opposite direction from PRP (primality) testing. In PRP'ing, we compute result = k*2^n+1 and check whether the result is (probably) prime. In sieving, we start with a bunch of possible results (actually, factors of results) and solve for n.
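In the smallest possible terms, 'solving for n' looks like the loop below (toy values, but 17 really does divide 4847*2^3+1); the real sieve replaces this linear scan with the discrete-log tricks discussed earlier in the thread:

#include <stdio.h>

int main(void) {
    const int p = 17, k = 4847;  /* toy factor and k */
    long long v = k % p;         /* v tracks k*2^n mod p */
    for (int n = 1; n <= 50; ++n) {
        v = v * 2 % p;
        if ((v + 1) % p == 0)    /* p divides k*2^n + 1 */
            printf("17 | 4847*2^%d+1\n", n);
    }
    return 0;
}

The hits come at n = 3, 11, 19, ... - every 8, the order of 2 mod 17 - so one factor knocks out a whole arithmetic progression of n at once, which is why sieving is so effective.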


    Just by sieving with 3,5,7,11,13,17 and 19 the only candidates that will remain will be....2.22%!

    Awesome. Thanks Phil

  29. #69

    SoBSieve 1.20

    Hi all,

This is a very quick note, as I have been busy over the last four days while my partner has been in hospital for surgery on her knee. In between visits to the hospital, I did manage to find a few hours to make some improvements to SoBSieve.

First of all, I moved the main body of the code into assembler, which gave a 50% improvement in speed. That still left me slower than Phil, which I couldn't understand, as I couldn't see what else could be done. However, I was missing a *very simple* trick in the code, which Phil passed on to me (thanks Phil!). It only took about half an hour to implement and test a naive version, and the end result of all this activity is that this version is well over twice as fast as 1.12.

    I still have some work to do - the assembler code needs to be profiled to get it as fast as possible, and I should really implement a smarter version of the trick - as I am sure Phil will be as well.

    I have given it a cursory test and found no problems, but if somebody can give it a quick parallel run and report back to the group that would be useful (I would, but I have got to get to the hospital). I do not expect any problems, but nothing is certain in software development.

    Regards,

    Paul.
Attached Files

  30. #70

    Re: SoBSieve 1.20

    Originally posted by paul.jobling
    well over twice as fast as 1.12.
    It sounds like you're still at my 009 tech. Unrolling by 2, which was my 009-010 speedup, gave another 50%. The C I gave you _should_ translate 1<->1 into assembly, once you've mapped variables onto mobile stack entries (bletch, which is why I use C rather than asm).

    Phil

  31. #71
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479
    AMD XP 2100+, Win2K:

    NbeGon_010 (alpha = 4.5) ...... 112Kp/s

    SoBSieve 1.20 (alpha = 1.5) ..... 102Kp/s

    (both using p=670G)

  32. #72
    It sounds like you're still at my 009 tech. Unrolling by 2, which was my 009-010 speedup, gave another 50%. The C I gave you _should_ translate 1<->1 into assembly, once you've mapped variables onto mobile stack entries (bletch, which is why I use C rather than asm).
Ah, well, my C code and your C code were doing the same thing, but I understand mine! I'll just stick with the conversion of my code into assembler, as I know what it is up to, and I have already done it. But the two programs were probably doing exactly the same thing in slightly different ways - your hash is completely different to mine, for example. Now that I have got that last piece of the puzzle I can start profiling the code...

    Cheers,

    Paul.

  33. #73
    Senior Member
    Join Date
    Jan 2003
    Location
    U.S
    Posts
    123
Paul, thanks for the new version!

However, with that version my rate skyrockets to 40,000 p/sec when I start the client, but falls to 35,000 p/sec a few minutes later. Should I just keep restarting the client every half an hour or so to improve production, or is this just a bug in the client? Have any other SoBSieve v1.2 users had this problem?

    P.S.

    "Sieving a G a day keeps many PRP tests away"

    Last edited by Moo_the_cow; 01-27-2003 at 04:19 PM.

  34. #74
    duplicates are pretty common. of the numbers left in the database,

714883 have no factors (yet)
    181629 have exactly 1 factor
    5987 have 2 known factors
    232 of them have 3 known factors
    6 of them actually have 4 known factors

    this excludes all the numbers below 1G which were never placed on the server. in other words, the sum of all the above should be about equal to the number of candidates left after sieving from 0 - 1 G.

4847*2^19202151+1 is one of the highly composite numbers...

120435469129 | 4847*2^19202151+1
15608701133 | 4847*2^19202151+1
34133798287 | 4847*2^19202151+1
224192486249 | 4847*2^19202151+1




on another totally separate note, i was thinking about p-1 factoring...

phil, paul, do either of you have an intelligent opinion about the viability of doing this? as prp test times increase, the benefit of doing it seems to increase too. i was toying with making it into a java applet the other day. i knew it would run slow-ish, but i was thinking about how cool it would be if there were a webpage you could just go to that would start doing work for the project on *ANY* computer you were on, w/o having to download and install software or even be on a system capable of running SB/SoBSieve. my java app idea is a little ambitious and mostly academic, but it would be interesting. anyone interested?

    -Louie

  35. #75
    However, with that version, my rate skyrockets to 40,000 p/sec when I start the client, but falls to 35,000 p/sec a few minutes later
This is probably just an artifact of the rate sampling. I see something similar; the first couple of rate reports are always higher than the steady state.

    AMD XP 2100+, Win2K:

    NbeGon_010 (alpha = 4.5) ...... 112Kp/s

    SoBSieve 1.20 (alpha = 1.5) ..... 102Kp/s
Ah, that is the same system as I have here, so I can confidently report that the attached SoBSieve 1.21 runs at 129Kp/s at 670G with alpha=2.5 (note the new value!).

Phil - this speedup came from doing the last improvement properly - i.e. for each k, we find a suitable multiplier, and then rather than multiplying by 16 (as at the moment), it multiplies by 2^48, 2^24, etc., as appropriate.

on another totally separate note, i was thinking about p-1 factoring...
Well, we are currently pulling out 12-digit factors very quickly through trial division - and we are looking at a whole load of numbers all together. For P-1, though, you would be looking at one number (e.g. 4847*2^14342252+1) at a time. So this should be compared to the PRP testing time; but remember that all it can do is eliminate a number by finding a small factor - if it doesn't find one, the number has got to be PRPed anyway.

Hmmm - I think investigation is required. I wonder if it would be worth putting x% of the PRP time aside to do this? It is a win if more than x% of the numbers are proven composite through the P-1 test...
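For reference, P-1 works by computing a^E mod N with the exponent E packed with small prime powers: if N has a prime factor p such that p-1 is built only from small primes, then (p-1) | E, so a^E = 1 (mod p) by Fermat's little theorem, and gcd(a^E - 1, N) exposes p. A toy stand-alone C sketch with made-up small numbers (a real k*2^n+1 candidate is millions of digits and would need GMP-class arithmetic):

#include <stdio.h>
#include <stdint.h>

typedef unsigned __int128 u128;

static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t n) {
    return (uint64_t)((u128)a * b % n);
}

static uint64_t powmod(uint64_t b, uint64_t e, uint64_t n) {
    uint64_t r = 1;
    for (; e; e >>= 1) {
        if (e & 1) r = mulmod(r, b, n);
        b = mulmod(b, b, n);
    }
    return r;
}

static uint64_t gcd(uint64_t a, uint64_t b) {
    while (b) { uint64_t t = a % b; a = b; b = t; }
    return a;
}

int main(void) {
    uint64_t N = 299;  /* = 13 * 23, and 13-1 = 12 = 2^2*3 is 5-smooth */
    uint64_t B1 = 5;   /* stage-1 smoothness bound */
    uint64_t a = 2;
    for (uint64_t j = 2; j <= B1; ++j)
        a = powmod(a, j, N);          /* a = 2^(B1!) mod N */
    uint64_t g = gcd(a - 1, N);
    if (g > 1 && g < N)
        printf("P-1 finds the factor %llu of %llu\n",
               (unsigned long long)g, (unsigned long long)N);
    return 0;
}

Paul's x% break-even then comes down to whether the time for one such exponentiation on a multi-million-digit number beats x% of a PRP test, times the odds that a factor's p-1 happens to be smooth.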

    Regards,

    Paul.
Attached Files

  36. #76


On my computer SoBSieve again runs the fastest :->

You say you get 129K; don't let this big number distract you!! Think of it as small - I am sure if you take 200K as "the goal", you can do it.

Sieving is so important!! I have to run my computer for ten days just to test one number. So as long as you sieve out more than 1 number/day, it is worth it!

    Keep up the good work

  37. #77
    Originally posted by ltd


    To Paul.

Are some of the performance gains also applicable to other calculations made with NewPGen, and will there be a new version in the months to come?

Keep up the good work.

    Lars
    Hi Lars,

Yes is the quick answer - as far as the fixed-k sieves go, anyway. There is a new version in beta test right now; it doesn't include all of the changes that I have made in the last week or so, but it is still significantly faster than the previous release.

You say you get 129K; don't let this big number distract you!! Think of it as small - I am sure if you take 200K as "the goal", you can do it.
    I have been optimising the assembler this morning, in between tending to Karen's every (k)need, and now SoBSieve manages 148K p/s. To go significantly faster would almost certainly need a new algorithmic or mathematical insight.

I will probably be posting the next version at 8pm GMT (is the time in GMT OK for everybody?). Hopefully I can get a little more out of it, but don't hold your breath.

    Regards,

    Paul.

  38. #78
Well, this is version 1.22. It is 8% faster than 1.21 on this PC, and unless I have a really good idea or I find a major optimisation, this will be the last release for a while.

    Regards,

    Paul.
Attached Files

  39. #79
When I start the client, this message appears:

WARNING: 64 bit integers are badly aligned as qwords!

but it seems to keep calculating, so I guess it's nothing to worry about.

I have a PIII 450 with 192MB RAM; Win 98; alpha 2.5;
tested >=200G.

btw I think I get >8% faster!! great job


    cheers

    Ola

  40. #80
I get the same warning as olaright, but then it stops processing.

    The application quits saying that "An attempt was made to execute an illegal instruction (0xc000001d) occurred in the application at location 0x00401618"

    add ecx,dword ptr [ebp-0Ch]
    adc edi,dword ptr [ebp-8]
    cmovb eax,ecx <-- failure point
    cmovb edx,edi

I have an AMD K6-2 at 450 MHz with 256 MB RAM.

    --Mark
