
Thread: download factors > 20M

  1. #1
    Senior Member
    Join Date
    Jun 2005
    Location
    London, UK
    Posts
    271

    download factors > 20M

    Is there somewhere I can download the factors (like the results.txt.bz2 file) that includes factors for n > 20M? Neither results.txt.bz2 nor results_marked_excluded_whatever.txt has any factors over 20M.

    I want to see how much has been removed from the dat file (I know it won't affect the speed yet!). Just interested.
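
    (If such a file turns up, the count is a few lines of Python. A minimal sketch, assuming, hypothetically, one factor per line in proth_sieve's "p | k*2^n+1" style; the filename is whatever the download ends up being called:)

    # Count factors with n > 20M, assuming one "p | k*2^n+1" line each.
    import re

    count = 0
    with open("results.txt") as f:          # hypothetical filename
        for line in f:
            m = re.search(r"\*2\^(\d+)\+1", line)
            if m and int(m.group(1)) > 20_000_000:
                count += 1
    print(count)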
    Quad 2.5GHz G5 PowerMac. Mmmmm.
    My Current Sieve Progress: http://www.greenbank.org/cgi-bin/proth.cgi

  2. #2
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479
    Quote Originally Posted by Greenbank
    Is there somewhere I can download the factors (like the results.txt.bz2 file) that includes factors for n > 20M? Neither results.txt.bz2 nor results_marked_excluded_whatever.txt has any factors over 20M.

    I want to see how much has been removed from the dat file (I know it won't affect the speed yet!). Just interested.
    Actually, results.txt.bz2 does have some; I'm just not sure how many.

    e.g. n=42255456, p=55002447571679

    is one of many that get spat out by my sieve scoring software.

    If anyone (e.g. Joe) has any info on exactly what the database was initialised with for 20 - 50M (i.e. what sieve depth), I'd be very interested to know.

    Cheers,
    Mike.

  3. #3
    Senior Member
    Join Date
    Jun 2005
    Location
    London, UK
    Posts
    271
    There are 7326 factors in results.txt.bz2 that would remove entries from SoB.dat, but that's not all of them.
    Quad 2.5GHz G5 PowerMac. Mmmmm.
    My Current Sieve Progress: http://www.greenbank.org/cgi-bin/proth.cgi

  4. #4
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Greenbank,

    If you have access to the Yahoo high-n sieve group: a while back I made some graphs and tables showing how many factors were removed.

    As for speed increases due to the number of k/n pairs removed... I don't remember if there was much of an increase in speed, if any. I seem to recall the sieve of the first few T was very, very slow: tens of kp/s on a 500 kp/s CPU. The speed increased, but I'm not sure if that was due to the decreased dat size or the effect of large p's or p density. I do know that the memory requirements decreased every time we removed n's. Removing k's obviously increases the sieve speed.

    I have all of the factors for 20M<n<50M and factrange results for n>50M, but it's several hundred MB of factors, pretty hard to e-mail. I actually snail-mailed a CD to Joe at one point.

  5. #5
    I believe an interesting question would be: how much would removing all tests that have already been double-checked speed up the sieving process? If we raised the lower bound, would we be able to remove factors faster? Sure, the low-n factoring has been fun, but once there are enough tests done below 1,000,000 I doubt there will be much interest in finding any more. People really want to find primes, and this is just a sidetrack for a little bit, so eventually I believe we will be removing all tests that have matching residues reported. How will this affect the sieve speed? Is it worth having a .dat for those who are not interested in low-n factoring? The sieve is very unlikely to find these factors, and another .dat can easily be maintained to keep track of these old factors in case someone ever wonders what they are for, say, triple-check purposes.

  6. #6
    Senior Member
    Join Date
    Jun 2005
    Location
    London, UK
    Posts
    271
    Doesn't really help, plus I (personally) don't like it for several reasons.

    I split the .dat file into two: one (low.dat) for n < 10M and one (high.dat) for n >= 10M.

    high.dat runs only 4% faster than the usual SoB.dat.

    Meanwhile low.dat is only 23% faster than the usual SoB.dat.

    So if you split the sieving in two it is much less efficient.
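
    (As a back-of-envelope check, assuming total sieve time scales with the inverse of these measured speeds:)

    # Cost of sieving one p range with the two split dats vs. the full
    # SoB.dat, using the 4% and 23% speed-ups above (full dat = 1 unit).
    high_pass = 1 / 1.04          # high.dat pass, 4% faster
    low_pass  = 1 / 1.23          # low.dat pass, 23% faster
    print(high_pass + low_pass)   # ~1.77x the work of one full-dat pass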

    Plus you are now missing possible factors. I know they are for PRP tests that we have matching residues for, but you can't argue with x divides k*2^n+1.

    Also, we still have outstanding tests way down at n=3.7M that haven't had double-check results confirming them.
    Quad 2.5GHz G5 PowerMac. Mmmmm.
    My Current Sieve Progress: http://www.greenbank.org/cgi-bin/proth.cgi

  7. #7
    I'm only talking about removing candidates that are never going to be tested again by PRP. That is to say, the factors that are found are useless. I say once the second pass passes an n value, cut it out of the testing. It's not prime; let it go. Then concentrate the brunt of the resources on something else. Sure, I thought it would be more than 4%, and maybe that's not worth the effort, but what real reason is there to keep testing below the second-pass line?

  8. #8
    Senior Member
    Join Date
    Jun 2005
    Location
    London, UK
    Posts
    271
    Quote Originally Posted by Keroberts1
    I'm only talking about removing candidates that are never going to be tested again by PRP. That is to say, the factors that are found are useless. I say once the second pass passes an n value, cut it out of the testing. It's not prime; let it go. Then concentrate the brunt of the resources on something else. Sure, I thought it would be more than 4%, and maybe that's not worth the effort, but what real reason is there to keep testing below the second-pass line?
    Because it's almost free and removing them would make almost no difference to the speed.

    The speed of proth_sieve is not related directly to the number of remaining n values but to the residue classes of the remaining n.

    To put it another way, if you removed 10% of the n-values in the current .dat file, the speed of the new .dat file would, with 99% probability, remain exactly the same.

    The speed-ups come from removing whole k values (when a prime is found) or if all n-values in one residue class are removed. The hint is in the output if you run proth_sieve with the -vv option:-

    10223: mod 96: 5 29 41 77 89
    19249: mod 720: 122 158 266 302 338 446 482 626 698
    21181: mod 264: 20 68 92 116 188 212 236 260
    22699: mod 360: 118 190 262 334
    24737: mod 120: 7 31 103
    33661: mod 360: 0 72 96 144 168 216 240 288 312
    55459: mod 180: 46 58 94 118 130 154 166
    67607: mod 360: 27 131 171 251

    So there would be a slight speed-up if any of these residue classes were removed. However, this requires finding factors for, or PRP testing, hundreds of numbers over the full 991 <= n <= 50M range.
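
    (To make the residue-class point concrete, a minimal sketch; the n values below are made up for illustration, not taken from the real dat:)

    # For one k, proth_sieve only cares which classes (n mod m) still
    # hold unfactored n values. Hypothetical check for k=24737, m=120,
    # whose real occupied classes are listed above as 7, 31 and 103.
    def occupied_classes(n_values, modulus):
        return sorted(set(n % modulus for n in n_values))

    ns = [7, 127, 31, 151, 103, 223, 967]   # made-up remaining n's
    print(occupied_classes(ns, 120))        # -> [7, 31, 103]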
    Quad 2.5GHz G5 PowerMac. Mmmmm.
    My Current Sieve Progress: http://www.greenbank.org/cgi-bin/proth.cgi

  9. #9
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    As far as I can see, there is only 60T of unreserved range left below 2^50. Even if the speed-ups were much higher, I would still doubt the usefulness of splitting the dat file at this stage.

    Moving to a 20M-50M dat file could be considered when we finish up to 2^50 and start filling the holes (the ranges that were not sieved with the 991-50M dat), though. I guess there is more than a 2^49-sized range (600T-700T?) untouched by 991-50M, right? Moving to a 30M-size dat would probably bring a speed increase of roughly 6-7%. On the other hand, these ranges were already sieved with an x-20M dat, so what we might lose there is limited to some forgotten factors (probably only a handful) due to human error.

  10. #10
    Sieve it, baby!
    Join Date
    Nov 2002
    Location
    Potsdam, Germany
    Posts
    959
    May I put in a word for the combined sieving with PSP?

    There, the n<50M sieving is currently at the same p as SoB. 15-20T have already been sieved for both projects at the same time, which is more efficient than doing the sieving separately. Especially in the last few months, some members with very large resources have joined PSP sieving, so SoB will likely profit from the joint effort too.

  11. #11
    Are any of the mod groups for proth_sieve smaller than the others? Perhaps if one was small enough we could set aside some resources to eliminate it. This would probably be mostly factoring, but perhaps some testing too if the values are low enough.

  12. #12
    Senior Member
    Join Date
    Jun 2005
    Location
    London, UK
    Posts
    271
    The smallest mod group is for k=19249, n = 626 mod 720, with 6461 n values. Way beyond what we could try to eliminate by PRPing. If you consider that the n values are spread across the entire 991->50M range, it would take many, many CPU years to eliminate them.
    Quad 2.5GHz G5 PowerMac. Mmmmm.
    My Current Sieve Progress: http://www.greenbank.org/cgi-bin/proth.cgi

  13. #13
    Senior Member
    Join Date
    Jun 2005
    Location
    London, UK
    Posts
    271
    Quote Originally Posted by Nuri
    As far as I can see, there is only 60T of unreserved range left below 2^50. Even if the speed-ups were much higher, I would still doubt the usefulness of splitting the dat file at this stage.

    Moving to a 20M-50M dat file could be considered when we finish up to 2^50 and start filling the holes (the ranges that were not sieved with the 991-50M dat), though. I guess there is more than a 2^49-sized range (600T-700T?) untouched by 991-50M, right? Moving to a 30M-size dat would probably bring a speed increase of roughly 6-7%. On the other hand, these ranges were already sieved with an x-20M dat, so what we might lose there is limited to some forgotten factors (probably only a handful) due to human error.
    If people want to get big scores from sieving they should sieve a new range with the 991 to 50M dat file. I believe that Chuck/Joe_O are close to releasing their new 32-bit client that breaks the 2^50 barrier, and is also much faster!

    Why worry about a 6% or 7% speed increase when the new client will give you more than that?

    If people want to contribute, but aren't that worried about stats and scores they should consider the "second-pass" sieving to fill in the gaps. Where possible they should use the combined SoB+PSP dat file.

    I have 2 x 2GHz Athlons sieving for SoB just below 2^50, and my Quad PowerMac G5 is sieving a combined SoB/PSP 8T range (125T to 133T).
    Quad 2.5GHz G5 PowerMac. Mmmmm.
    My Current Sieve Progress: http://www.greenbank.org/cgi-bin/proth.cgi

  14. #14
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Sorry I didn't contribute to this discussion earlier.

    Many of the points were already covered, so I'll just comment.

    Probably over a year ago now, Joe and I did some exhaustive studies on dat file size and how the n-range relates to sieve speed. The 991<n<50M dat size was determined to be the most efficient; I believe the exact optimal point was somewhere around a 53M range. If you want to go back through the graphs in the low-n second-pass sieve thread, they're in there, along with an explanation as to why. In brief, there is an optimal range between 45M-55M: much above 55M the efficiency drops dramatically, and below 45M the speed gains are offset by efficiency losses.

    A rough example would be the difference between the 50M dat we currently have and the previous 19M (1M<n<20M) dat. Moving to the 50M dat only reduced the speed by <20%, something more like 15%.
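
    (A rough way to see why the bigger dat still wins: assuming factor yield scales with the n-range covered, and using the numbers above:)

    # 19M dat -> 50M dat: ~15% slower, but each p tested now covers
    # ~2.6x as much n-range. Net yield per unit of CPU time:
    range_ratio = 50.0 / 19.0          # ~2.63x more n covered per p
    speed_ratio = 0.85                 # ~15% slower raw sieve rate
    print(range_ratio * speed_ratio)   # ~2.24x more productive overall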

    Technically we could reduce the dat size, and yes it would be faster, but even the most drastic reduction, say a 7M<n<50M dat, wouldn't yield more than a few percent of sieve speed increase (if even noticeable), and of course we have all those uncompleted tests below 7M.

    To comment on Nuri's point about the unsieved ranges for n>20M:

    Yes, they do exist, and we probably won't go back and check those ranges with a 20M<n<50M dat. The reason is that the PSP-SoB combined sieve will probably surpass that level before our n reaches 18M. Second, we will probably eliminate a few k's before that point as well.

    Currently, and for the foreseeable future, we will continue with a 991<n<50M sieve range. Joe and Chuck's client will also be out before we complete everything below the current proth_sieve limit. I've been using it for a few weeks now at >1126T, for those of you who haven't noticed. The client is great, though the version I have is not significantly faster. But it's more stable, with fewer page faults and a decent memory footprint. I've tested the same range with different clients and computers, all of which produced the same results; needless to say, the client is no longer a dream. Joe and Chuck will make the release when they are ready.

  15. #15
    Last but not least, a reason for maintaining the status quo is that every change in dat files causes confusion, leading to factor losses through human error.
    And the good point about the 0-50M dat is that one can easily put it together with the PSP file.
    So, for the next year at least, I see a friendly coexistence of the SoB first-pass sieve and the PSP-SoB joint sieve.
    H.
    ___________________________________________________________________
    Sievers of all projects unite! You have nothing to lose but some PRP-residues.

  16. #16
    Senior Member
    Join Date
    Jun 2005
    Location
    London, UK
    Posts
    271
    Indeed, I'm working on two fronts:

    My 2 Athlons are helping in the effort to finish off the p < 2^50 range for SoB. (~57T left to assign, more to report back).

    My Mac is doing combined sieving for SoB and PSP.

    PSP's sieving speed will have dropped (as they are now doing combined sieving to help us), so we (as SoB sievers) need to help them out by doing some of their first-pass sieving.

    Of course the choice is yours as to whether you do SoB first-pass or combined SoB+PSP.
    Quad 2.5GHz G5 PowerMac. Mmmmm.
    My Current Sieve Progress: http://www.greenbank.org/cgi-bin/proth.cgi

  17. #17
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Question for you, Greenbank.

    At this point, how much of p < 100T has PSP sieved with the 991<n<50M dat? Is it all up to 87T? I don't see the two threads anymore.

    Also, what is the speed difference between the SoB-only, PSP-only and combined dats? I thought the combined dat was only 15% slower compared to the PSP-only dat; is this still true? How about SoB compared to combined?

    I agree that second-pass SoB sievers should strongly consider using the combined dat, since PSP hasn't first-passed these ranges of p yet.

  18. #18
    @vjs: I don't quite understand your question, but I'll try to answer.
    The reservation thread is here:
    http://www.mersenneforum.org/showthread.php?t=2666

    PSP joint sieving started at 85T; ranges below that are being sieved PSP-only at this very time. The lowest available p is 109T.

    As for the speed difference between PSP and PSP/SoB, your value (15%) coincides with my latest information.
    My guesstimate for the SoB to PSP/SoB value is <<30%, but I don't actually know.
    PSP-only ranges are not going to be assigned again for a long time.

    At the very beginning there were two threads, but PSP adopted the joint-sieve policy soon after, so the old thread was erased.

    As for the SoB second-pass sievers, there are not many of them, if you don't count Greenbank, who donates heavy sieve power. So let's encourage people to head over to PSP and grab a range.
    H.
    ___________________________________________________________________
    Sievers of all projects unite! You have nothing to lose but some PRP-residues.

  19. #19
    Senior Member
    Join Date
    Jun 2005
    Location
    London, UK
    Posts
    271
    Quote Originally Posted by vjs
    Question for you, Greenbank.

    At this point, how much of p < 100T has PSP sieved with the 991<n<50M dat? Is it all up to 87T? I don't see the two threads anymore.
    http://www.mersenneforum.org/showthread.php?t=2666

    Ranges under 100T sieved that are marked as combined:-

    85000-88000 completed by ltd
    90000-95000 completed by hhh and ltd

    From looking at the SierpinskiSieve Yahoo group it looks like everything under 100T has been assigned (and has probably been returned). There are a couple of ranges on the Yahoo group (including 83T-85T assigned to myself) marked as Reserved instead of Complete. They have been completed; I hope I submitted them.

    Quote Originally Posted by vjs
    Also, what is the speed difference between the SoB-only, PSP-only and combined dats? I thought the combined dat was only 15% slower compared to the PSP-only dat; is this still true? How about SoB compared to combined?
    These are the speeds for my G5:-

    SoB.dat: p = 125000010000011 @ 1129kp/s
    psp.dat: p = 125000010000011 @ 858kp/s
    combined.dat: p = 125000010000011 @ 624kp/s

    For a 2GHz Athlon XP 2400+ using proth_sieve v0.42:-

    SoB.dat: p = 125000010000011 @ 499kp/s
    psp.dat: p = 125000010000011 @ 393kp/s
    combined.dat: p = 125000010000011 @ 229kp/s

    Taking the Athlon speeds:

    Time to sieve 1G of SoB: 10^9/499000 = 2004 secs
    Time to sieve 1G of PSP: 10^9/393000 = 2544 secs
    Time to sieve 1G of both: 10^9/229000 = 4366 secs

    So, time to sieve SoB then PSP separately = 2004 + 2544 = 4548 secs. Combined sieving saves you about 4%.

    But with an optimised client like my G5 version:-

    1G SoB: = 10^9/1129000 = 885
    1G PSP: = 10^9/858000 = 1165
    1G Both: = 10^9/624000 = 1602

    So, time to sieve SoB then PSP: = 885+1165 = 2050. Combined sieving saves me about 22%.
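
    (The same comparison as a reusable sketch; the rates are the kp/s figures quoted above:)

    # Fractional time saved by sieving one combined range instead of
    # two separate ranges, given the three rates in kp/s.
    def combined_saving(rate_a, rate_b, rate_both):
        separate = 1.0 / rate_a + 1.0 / rate_b   # two passes over p
        combined = 1.0 / rate_both               # one pass over p
        return 1.0 - combined / separate

    print(combined_saving(499, 393, 229))    # Athlon: ~0.04  (4%)
    print(combined_saving(1129, 858, 624))   # G5:     ~0.22 (22%)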

    Quote Originally Posted by vjs
    I agree that second-pass SoB sievers should strongly consider using the combined dat, since PSP hasn't first-passed these ranges of p yet.
    Yes. Please do!

    OK, so now we just rely on reservations being made over at the PSP forum and for some kind SoB sievers to donate some computing power to sieving a combined range from over there!
    Last edited by Greenbank; 03-29-2006 at 11:49 AM.
    Quad 2.5GHz G5 PowerMac. Mmmmm.
    My Current Sieve Progress: http://www.greenbank.org/cgi-bin/proth.cgi

  20. #20
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Joe and I started the second-pass sieve effort with SoB at the Yahoo group.

    The reason I was asking about p<100T for PSP: Joe and I sieved pretty much the entire range of p<20T. With the help of Hades, Ironbits, Rycheck and a few others we brought this level up to about 70T, filling in holes etc. In the process we reduced the dat significantly and reduced the memory requirements.

    I know PSP sieved a large portion of the p range with a 1-20M dat up to something like 80T. Later they went back and started again with the 991<n<50M dat to reduce the size, as Joe and I did. I'm just curious how far this effort got before they decided to simply skip their second-pass effort.

    Thanks for the speed updates with respect to the k's in the dats.

  21. #21
    Senior Member
    Join Date
    Jun 2005
    Location
    London, UK
    Posts
    271
    OK, my turn for a question. The current SoB.dat has 1114308 lines; subtract 11 (the number-of-k's, lown and highn header lines, plus the 8 k= lines) and that leaves 1114297 unfactored n's.

    How many would be left if we removed any that have been factored?

    I know there would be no speed increase, but this was really the question I had when I started this thread! If I had the complete list of factors (I know it's big) I wouldn't need to ask this question. :-)
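
    (With the full factor list in hand, the count would be a short script. A sketch only: it assumes, as the line count above suggests, that the dat holds one n per line under each k= header, and that the factor list has been reduced to a set of (k, n) pairs; the real dat may well encode n's differently.)

    # Count the (k, n) pairs in the dat that survive a factor list.
    def remaining_pairs(dat_lines, factored):   # factored: set of (k, n)
        count, k = 0, None
        for line in dat_lines:
            line = line.strip()
            if line.startswith("k="):
                k = int(line[2:].split()[0])    # start of a new k block
            elif k is not None and line.isdigit():
                if (k, int(line)) not in factored:
                    count += 1
        return count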
    Quad 2.5GHz G5 PowerMac. Mmmmm.
    My Current Sieve Progress: http://www.greenbank.org/cgi-bin/proth.cgi

  22. #22
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Greenbank,

    A very simple question, but there is an even simpler solution.

    Joe has talked about updating the current dat file recently, since we haven't done this in quite some time. Part of the reason is that we don't see as drastic a change in the dat size per update now that we are sieving large p's.

    Perhaps an updated dat with the factored pairs removed is in short order?

    I haven't pushed Joe at all on the dat, since he has been busy with the client.

    Have you completed your 8T range and sent the factors to Joe? That range in itself would make a dent, I'm sure.

  23. #23
    Sieve it, baby!
    Join Date
    Nov 2002
    Location
    Potsdam, Germany
    Posts
    959
    Quote Originally Posted by vjs
    I know PSP sieved a large portion of the p range with a 1-20M dat up to something like 80T. Later they went back and started again with the 991<n<50M dat to reduce the size, as Joe and I did. I'm just curious how far this effort got before they decided to simply skip their second-pass effort.
    PSP has resieved the complete range (1 < p < 109T right now, incl. reservations) with the n<50M dat.

  24. #24
    Senior Member
    Join Date
    Jun 2005
    Location
    London, UK
    Posts
    271
    Quote Originally Posted by vjs
    Have you completed your 8T range and sent the factors to Joe? That range in itself would make a dent, I'm sure.
    I've been submitting any new factors every day. So far (77% of the range done) I've submitted 1006 SoB factors (and 2431 PSP factors!).

    ETA for completion of the range is Apr 7th. See link in sig for more details.
    Quad 2.5GHz G5 PowerMac. Mmmmm.
    My Current Sieve Progress: http://www.greenbank.org/cgi-bin/proth.cgi

  25. #25
    Senior Member
    Join Date
    Jun 2005
    Location
    London, UK
    Posts
    271
    VJS/Joe_O, do you mind if I copy the following two images out:

    http://ph.groups.yahoo.com/group/Sie...view/558c?b=10
    http://ph.groups.yahoo.com/group/Sie.../view/558c?b=5

    and host them on my site so that people over at the PSP forum can see them without having to go through the hassle of Yahoo Groups?
    Quad 2.5GHz G5 PowerMac. Mmmmm.
    My Current Sieve Progress: http://www.greenbank.org/cgi-bin/proth.cgi

  26. #26
    Moderator Joe O's Avatar
    Join Date
    Jul 2002
    Location
    West Milford, NJ
    Posts
    643
    Quote Originally Posted by Greenbank
    VJS/Joe_O, do you mind if I copy the following two images out:

    http://ph.groups.yahoo.com/group/Sie...view/558c?b=10
    http://ph.groups.yahoo.com/group/Sie.../view/558c?b=5

    and host them on my site so that people over at the PSP forum can see them without having to go through the hassle of Yahoo Groups?
    Go for it!
    PS: These and other images have been posted in this forum. Search for png attachments.
    Joe O

  27. #27
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Greenbank, take whatever you need or want; I actually encourage you to distribute them.

    I'd only hope that PSP would make graphs similar to the ones Joe and I made. I'd actually like to see a graph of remaining factors vs p sieved for PSP.

  28. #28
    Senior Member
    Join Date
    Jun 2005
    Location
    London, UK
    Posts
    271
    Ooh, remaining factors vs. sieved p. What a nice idea. I think I'll get perl and gnuplot onto that straight away (for SoB anyway; I don't have access to the PSP factors).
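
    (An equivalent sketch in Python/matplotlib, for anyone without gnuplot to hand; it assumes, hypothetically, one "p | k*2^n+1" factor per line and a file named results.txt:)

    # Cumulative factors found vs. sieve depth p. The mirror image of
    # "remaining candidates vs. p": each factor removes one candidate.
    import matplotlib.pyplot as plt

    ps = []
    with open("results.txt") as f:           # hypothetical filename
        for line in f:
            if "|" in line:
                ps.append(int(line.split("|")[0]))

    ps.sort()
    plt.plot(ps, range(1, len(ps) + 1))
    plt.xscale("log")
    plt.xlabel("sieve depth p")
    plt.ylabel("cumulative factors found")
    plt.show()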
    Quad 2.5GHz G5 PowerMac. Mmmmm.
    My Current Sieve Progress: http://www.greenbank.org/cgi-bin/proth.cgi

  29. #29
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    We actually have that for SoB already... I'd have to dig through my records for the 1-20M data, but the 1-50M is already plotted in the Yahoo group.

    (Edit: here is the one for <100T and 1-20M)
    http://ph.groups.yahoo.com/group/Sie.../view/558c?b=1

    The plots basically show a tremendous number of factors found at low p and ever-diminishing returns at higher p. No worries, though: the drop-off starts to level out by 50T, and there isn't much of a decrease after 500T.

    (Edit: here is the one for remaining pairs vs. p sieved for the 991-50M dat, also not up to date)
    http://ph.groups.yahoo.com/group/Sie.../view/558c?b=8

    It's an interesting plot and worth the effort if you'd like to make a pretty graph.

  30. #30
    Knight of the Old Code KWSN_Dagger's Avatar
    Join Date
    Feb 2005
    Location
    Western Canada
    Posts
    61
    Probably a little off topic, but I found 2 factors with n < 6M. Seeing as the current window is 6.7M < n < 6.9M, would those candidates get tested at all? Or will they just drop off the map, so to speak?

  31. #31
    Quote Originally Posted by KWSN_Dagger
    Probably a little off topic, but I found 2 factors with n < 6M. Seeing as the current window is 6.7M < n < 6.9M, would those candidates get tested at all? Or will they just drop off the map, so to speak?
    Those candidates have ALREADY been PRPed twice and were never going to be tested again anyway; but that's what you meant, and in those terms the factors are indeed useless. Still, it's nice to have them, and adjusting the dat file so that they wouldn't be searched for would not give a big speed increase. Somewhat useless, but free.
    H.
    ___________________________________________________________________
    Sievers of all projects unite! You have nothing to lose but some PRP-residues.
