
Thread: When to switch to a sieve depth of 48?

  1. #1

    Question When to switch to a sieve depth of 48?

    Recently, I've been using a sieve depth of 47 since the sieve has totally completed up to 2^47 ~ 140T.

    But it may soon (or already) make sense to use 48. We have already crossed the 90% sieve point to 200T and much between 200T and 281T has been sieved as well.

MikeH or Nuri would know best here. Does someone know what percentage of the factors between 2^47 and 2^48 have already been found by sieving? I would guess that even though less than 50% of the 2^47-2^48 range has been sieved, probably well over 50% of the total factors in that range have been found. If that's the case, I think the time to switch to the higher depth number is now.

    Mike, how do you feel about adding a stats page somewhere showing the % of factors left to find in each power of 2 range? A page that showed something like:

    .
    .
    .
    45-46: 00.01%
    46-47: 00.12%
    47-48: 32.84%
    48-49: 89.23%
    .
    .
    .

Of course, these numbers will necessarily be estimates since you can't know in advance exactly how many factors are in each range, but a rough idea is all we need. The page could then conclude at the bottom which sieve depth to use for factoring by choosing the high end of the highest range that has less than 50% of its factors left to find (48 in my example numbers).
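A minimal sketch of that selection rule, using the illustrative percentages above (the real input would be the per-range estimates from the stats page):

Code:
    # Hypothetical estimated percentage of factors still to be found in each
    # power-of-two range; the values here are the illustrative numbers above.
    factors_remaining = {
        (45, 46): 0.01,
        (46, 47): 0.12,
        (47, 48): 32.84,
        (48, 49): 89.23,
    }

    def recommended_sieve_depth(remaining, threshold=50.0):
        """High end of the highest range with less than `threshold` percent
        of its factors still unfound."""
        return max(hi for (lo, hi), pct in remaining.items() if pct < threshold)

    print(recommended_sieve_depth(factors_remaining))  # -> 48 with these numbers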

Another reason to move to 48 is that we currently don't have enough computing power devoted to factoring to do an optimal amount of work anyway. So if we set the sieve depth one higher, it will speed up the tests (i.e. lower their bounds), making it possible to cover larger ranges before SB gets to them.

    I look forward to hearing your comments.

    Cheers,
    Louie

  2. #2
    Junior Member
    Join Date
    Aug 2003
    Location
    France
    Posts
    24
I use low, suboptimal bounds (B1=10000, B2=97500) in order to be able to test more numbers (is it better to test as many numbers as possible with low bounds, or to leave large gaps of untested numbers between optimally tested ones??) - message I posted when reserving my working range -

- garo's answer ->
Given the fact that a large number of candidates are not getting any P-1 at all, I think you are on the right track. It is more efficient to do a little P-1 on all the numbers than to do a lot on only some of them. Of course, ideally one would want to do a lot on all the numbers, but since that is not possible....
In my case, with a Celeron 400, I chose to use lower bounds than "allowed", cheating with the parameters to get B1=10000 and B2=10*B1... the reason is that such a B1 is optimal for 20-digit factors with ECM, so it should be enough to find Proth factors * and lets me test more than 4 composites a day !! * (see the optimal parameters table on the ECMNET site: http://www.loria.fr/~zimmerma/records/ecmnet.html).

So many numbers aren't being tested with the resources currently devoted to P-1... (On the other hand, I know that the best performance would be reached with optimal bounds. It is also true that, for the same computation time, a probability of finding 1 factor out of 20 numbers with high bounds is the same as a probability of finding 1 out of 100 with lower bounds and faster tests.) The same applies to ECM: one can compute a few curves with high bounds, but the common practice is to compute many dozens of smaller ones...
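A minimal sketch of the low-bounds versus high-bounds trade-off described here, assuming hypothetical throughputs; the 1-in-20 and 1-in-100 probabilities are from the post, and the 5x difference in candidates covered is implied by it:

Code:
    # Two P-1 strategies over the same total computation time (illustrative
    # numbers, following the post: high bounds find a factor for ~1 in 20
    # candidates, while low bounds let you test ~5x as many candidates,
    # each with a ~1 in 100 chance).
    def expected_factors(candidates_tested, prob_per_candidate):
        return candidates_tested * prob_per_candidate

    high_bounds = expected_factors(candidates_tested=20, prob_per_candidate=1 / 20)
    low_bounds = expected_factors(candidates_tested=100, prob_per_candidate=1 / 100)

    print(high_bounds, low_bounds)  # 1.0 vs 1.0 -> same expected factors,
                                    # but low bounds leave no untested gaps.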

    for info I use parameters : 39 0.35 150

    Bye.

  3. #3
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    I've done a quick calculation based on Mike's pages, and it seems like we're finished with 39.8% of the 140T-280T range and have found 44.8% of the total expected unique factors for this range.

Currently, the cumulative sieving speed is roughly 1.05T per day. If we continue to sieve at the current rate, it is highly likely that we'll finish almost everything up to 280T (i.e. the 95% sieve point) before the end of April. There should be roughly 5,200 unique factors left in the holes up to 280T for 6m<n<20m.


As far as the factoring is concerned, the time my computer takes to finish a PRP test of a k/n pair is roughly 100 times what it takes for a P-1 test. According to the project stats page, we have roughly 1,000 active PCs running PRP. So it seems to me that, for the time being, a dedicated computing power of 10 PCs allocated to factoring should be sufficient to keep up with the PRP. I don't know, maybe I'm calculating something wrong.
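A quick check of that estimate, using the two figures from the post (a P-1 test costs roughly 1/100 of a PRP test, and about 1,000 PCs are running PRP):

Code:
    # Rough estimate of the dedicated P-1 machines needed to keep pace with PRP.
    active_prp_pcs = 1000            # from the project stats page
    p1_time_over_prp_time = 1 / 100  # one P-1 test ~ 1/100 of one PRP test

    dedicated_p1_pcs = active_prp_pcs * p1_time_over_prp_time
    print(dedicated_p1_pcs)  # ~10 machines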

    My current P-1 range has 216 k/n pairs. With the 47 switch (and 160 MB), expected number of factors for the entire range is 1.87, which is less than 1%. If I understand it correctly, when we switch to 48 now, we'll get even less. Or, in other words, as we continue to do sieving, the benefit we'll get from P-1 will diminish even more.


I'm not sure if there was a mistake in my computation (or my command entry), but assuming a factor rate of 1%, even if we do P-1 factoring of everything from n=6m up to n=20m (~409,000 k/n pairs), we'll end up with roughly 4,100 factors.


And again, maybe I am calculating something wrong here, but I still feel that sieving is significantly more effective. Even just the next 80 days of sieving will find more unique factors than the total we might expect from P-1 from here up to 20m.
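A back-of-the-envelope version of that comparison, using only figures quoted earlier in this post (1.05T/day sieve rate, ~5,200 expected unique factors left below 280T, ~409,000 untested pairs at a ~1% P-1 hit rate):

Code:
    # Sieving: the 140T-280T range is ~39.8% done, so about 60% of 140T remains.
    remaining_T = 140 * (1 - 0.398)        # ~84T left to sieve
    days_to_finish = remaining_T / 1.05    # ~80 days at 1.05T per day
    sieve_factors_expected = 5200          # unique factors left in holes below 280T

    # P-1: ~409,000 untested k/n pairs from n=6M to n=20M at roughly a 1% hit rate.
    p1_factors_expected = 409000 * 0.01    # ~4,100 factors

    print(round(days_to_finish), sieve_factors_expected, round(p1_factors_expected))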


I'm not trying to say we should not be factoring. What I have in mind is that we should allocate enough resources to factoring to cover all k/n pairs before they are PRP tested (which I guess will not be very hard). My PC does 11 factoring tests per day. Currently, the project's PRP speed is 150 tests per day, so this alone is roughly 8% of what needs to be covered. I'm pretty sure we'll be able to keep up with PRP soon.
    Last edited by Nuri; 02-05-2004 at 04:02 PM.

  4. #4
    Junior Member
    Join Date
    Aug 2003
    Location
    France
    Posts
    24
    Nuri,

While I understand your arguments, I don't really agree with your computations: P-1 finds factors close to the range currently being tested, while sieving covers the whole 6-20M range. What about all the now-useless factors previously found for k=5359 in the 5-20M range (since a prime has been discovered for that k)? Such computations should take into account the probability of finding primes during the YEARS it will take to reach 20M (or maybe new algorithms, who knows).

The great advantage of P-1 is that it concerns numbers that will be assigned only a few weeks later, so it directly affects the PRP speed (the factors found by sieving within the active window have the same direct effect, but they are far fewer than the 5,200 in the 47-48 bit range).

If I have a 2% chance of skipping a PRP test for 1% extra computation, I'll take it (and I hope it can be integrated into the client soon)... but I do agree that sieving is still effective... I think the question is not choosing between methods but coordinating the efforts.
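A quick expected-value check of that trade-off, with costs measured in PRP-test equivalents (the 2% and 1% figures come from the post; the number of tests saved per factor is an assumption):

Code:
    # Per 100 candidates: each gets a P-1 run costing 1% of a PRP test,
    # and ~2 of them yield a factor that skips at least one PRP test.
    candidates = 100
    p1_cost_per_candidate = 0.01    # in PRP-test units (the "1% extra computation")
    factor_chance = 0.02            # the "2% chance" from the post
    prp_tests_saved_per_factor = 1  # assumption; higher if double checks count

    cost = candidates * p1_cost_per_candidate                         # 1.0
    saving = candidates * factor_chance * prp_tests_saved_per_factor  # 2.0
    print(saving > cost)  # True -> the extra 1% pays for itself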

This has been discussed at length before; should we continue??? :bs:

    Best wishes.
    Last edited by MJX; 02-05-2004 at 04:24 PM.

  5. #5
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    Of course.... I agree with most of what you say.

Maybe I could not express myself clearly enough.

    What I say is:

All we need to do is dedicate the equivalent of about 10 average PCs' worth of computing power to P-1 to keep up with PRP, and that won't be a problem at all.

  6. #6
Agreed! And while we don't have enough resources, people should use a factor value of 1.2 or 1.3 for P-1 so as to speed up the factoring. This factor tells us approximately how many PRP tests a found factor is going to save.
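A rough sketch of how such a factor value enters the decision of whether given P-1 bounds are worthwhile, assuming a simple cost model in PRP-test time (this is the usual trade-off, not the client's actual code; all numbers are illustrative):

Code:
    # Is a given set of P-1 bounds worthwhile? Compare the expected PRP time
    # saved against the P-1 time spent. `factor_value` is the approximate
    # number of PRP tests a found factor saves (1.2-1.3 suggested above).
    def p1_worthwhile(prob_of_factor, factor_value, prp_time, p1_time):
        return prob_of_factor * factor_value * prp_time > p1_time

    # Illustrative numbers only: 2% chance of a factor, factor value 1.2,
    # a 100-hour PRP test and a 1-hour P-1 run.
    print(p1_worthwhile(0.02, 1.2, prp_time=100, p1_time=1))  # True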

  7. #7
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479
Originally posted by Louie
Mike, how do you feel about adding a stats page somewhere showing the % of factors left to find in each power of 2 range? A page that showed something like:
    Not quite finished yet, but it'll look something like this (the numbers are correct for yesterday).

    Code:
        range                      range remaining    factors remaining
     2^45 - 2^46 (  35T -   70T)             0.00%                0.00%
     2^46 - 2^47 (  70T -  141T)             0.00%                0.00%
     2^47 - 2^48 ( 141T -  281T)            59.43%               48.63%
     2^48 - 2^49 ( 281T -  563T)            99.68%               99.57%
     2^49 - 2^50 ( 563T - 1126T)           100.00%               99.99%
So from this, the recommended depth would be '48', because the factors remaining in 2^47 - 2^48 are less than 50%?

And this would change to '49' when the factors remaining for 2^48 - 2^49 drop below 50%?

    Does that sound correct? Any other comments?

  8. #8
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    I'm not sure if 50% is a good indicator to switch to 48. However, since it will take only ~3 months for the sieving to complete most of the ranges below 281T, I think we can confidently switch to 48.

  9. #9
    Senior Member dmbrubac's Avatar
    Join Date
    Dec 2002
    Location
    Ontario Canada
    Posts
    112
Well, forgetting the fancy math for a moment, I've been factoring on 3 machines for a couple of weeks now (half of a dual P3 1.3, an Athlon 2200+ and a P4 2400), and just that contribution is filling in a lot of gaps.

It looks like 2 or 3 more machines will make it obvious we are pulling away from PRP. Judging by the number of people reserving ranges, 10 to 15 machines in total feels right.

  10. #10
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    Exactly.

Yesterday, I changed my depth setting from 47 to 48 and my xPRP setting (I don't remember what you call it, but I hope it is clear what I mean) from 1.5 to 1.2, and as a result I am now able to test 25 k/n pairs per day on a single PC. Roughly speaking, this is 1/6 of the number of tests we need to do per day to keep up with the PRP effort.

It's highly likely that with 10-15 dedicated machines we'll soon even have the luxury of increasing our xPRP settings a bit. I mean, there will hopefully be no need to use extremely low xPRP settings just for the sake of keeping up with PRP.

  11. #11
How many factors are expected per power of 2? Is it a steady number, or does it grow/shrink as time passes?

  12. #12
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479
How many factors are expected per power of 2? Is it a steady number, or does it grow/shrink as time passes?
Here's the real data (the last estimates are probably a bit off right now, but not bad given only 61 factors found so far)
    Code:
        range                      range remaining    factors remaining    factors found/estimated total
     2^30 - 2^31 (   0T -    0T)             0.00%                0.00%            32068/32068
     2^31 - 2^32 (   0T -    0T)             0.00%                0.00%            30026/30026
     2^32 - 2^33 (   0T -    0T)             0.00%                0.00%            28383/28383
     2^33 - 2^34 (   0T -    0T)             0.00%                0.00%            26589/26589
     2^34 - 2^35 (   0T -    0T)             0.00%                0.00%            25028/25028
     2^35 - 2^36 (   0T -    0T)             0.00%                0.00%            23663/23663
     2^36 - 2^37 (   0T -    0T)             0.00%                0.00%            22234/22234
     2^37 - 2^38 (   0T -    0T)             0.00%                0.00%            21377/21377
     2^38 - 2^39 (   0T -    1T)             0.00%                0.00%            20275/20275
     2^39 - 2^40 (   1T -    1T)             0.00%                0.00%            19254/19254
     2^40 - 2^41 (   1T -    2T)             0.00%                0.00%            18383/18383
     2^41 - 2^42 (   2T -    4T)             0.00%                0.00%            17225/17225
     2^42 - 2^43 (   4T -    9T)             0.00%                0.00%            15895/15895
     2^43 - 2^44 (   9T -   18T)             0.00%                0.00%            15345/15345
     2^44 - 2^45 (  18T -   35T)             0.00%                0.00%            14447/14447
     2^45 - 2^46 (  35T -   70T)             0.00%                0.00%            13454/13454
     2^46 - 2^47 (  70T -  141T)             0.00%                0.00%            13531/13531
     2^47 - 2^48 ( 141T -  281T)            56.43%               53.45%             6183/13282
     2^48 - 2^49 ( 281T -  563T)            99.67%               99.58%               61/14683

  13. #13
    Originally posted by MikeH
    Code:
        range                      range remaining    factors remaining    factors found/estimated total
     2^45 - 2^46 (  35T -   70T)             0.00%                0.00%            13454/13454
     2^46 - 2^47 (  70T -  141T)             0.00%                0.00%            13531/13531
     2^47 - 2^48 ( 141T -  281T)            56.43%               53.45%             6183/13282
     2^48 - 2^49 ( 281T -  563T)            99.67%               99.58%               61/14683
    Seems odd that there are actually more factors in the higher ranges.

I understand that the ranges are getting 2x larger, but still, shouldn't there be fewer in each successive range? The 2^46 - 2^47 count isn't even an estimate, and it is larger than the number of factors in 2^45 - 2^46.

Also, how do these numbers take into account the low ranges (does the data go down to 300k, 1M, or 3M?) and the prime found since factoring started (k=5359)?

    Cheers,
    Louie

  14. #14
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    Looking at the figures, and assuming they are correct, my guess is:

    The data is for 1m<n<20m range, for only unique factors, and it excludes any factor found for k=5359.

Normally, there seems to be roughly a 6% drop in the number of unique factors found from each range to the next.

In my opinion, the reason for the increase from 2^45 - 2^46 to 2^46 - 2^47 is that almost everything in the 2^46 - 2^47 range was sieved with the 1m<n<20m SoB.dat, whereas a larger share (percentage-wise) of the 2^45 - 2^46 range was sieved with the 3m<n<20m SoB.dat.

    Assuming the 6% drop is a valid argument, then we should have a total of 12,769 unique factors (1m-20m) for the 2^47 - 2^48 ( 141T - 281T) range, and 12,050 for the 2^48 - 2^49 ( 281T - 563T) range.

Also, based on the data in the table above, the minimum decrease in the number of factors per successive range is 3.6% and the maximum is 8.4%. If we take those as our boundaries, the range of expected factors is 12,486 - 13,063 for the 141T - 281T range, and 11,522 - 12,611 for the 281T - 563T range.


    Please note that these are simply my guesses based on some back of the envelope calculations, and Mike could answer the question much better.
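A minimal sketch of this extrapolation, assuming the geometric-decline model described above and using the found-factor counts from Mike's table; the result lands in the same 12-13 thousand ballpark:

Code:
    # Found-factor counts for the completed power-of-two ranges
    # (2^30-2^31 up to 2^46-2^47), copied from Mike's table above.
    found = [32068, 30026, 28383, 26589, 25028, 23663, 22234, 21377,
             20275, 19254, 18383, 17225, 15895, 15345, 14447, 13454, 13531]

    # Per-step decline rate between successive ranges (the last step is the
    # anomalous 2^45-2^46 -> 2^46-2^47 increase discussed above).
    drops = [1 - b / a for a, b in zip(found, found[1:])]
    avg_drop = sum(drops) / len(drops)     # roughly 5% on average

    # Extrapolate from the last completed range to the next two.
    est_47_48 = found[-1] * (1 - avg_drop)
    est_48_49 = est_47_48 * (1 - avg_drop)
    print(round(avg_drop * 100, 1), round(est_47_48), round(est_48_49))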

  15. #15
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479
    Nuri, I think you've hit the nail on the head.

    These are the factors for the range 1M < n < 20M, so there will be some gaps in the 1M < n < 3M bit that could be influencing the anomaly we are seeing here.

    One thing you've just raised though, 5359 - I haven't factored this into the equation at all. So that means all the factors that we found for 5359 are included throughout, and are also influencing the estimates.

    So what we should read from this is that the last two estimates are probably quite a way off, and will slide down over time as more factors are submitted.

  16. #16
Could the same data be generated with the 1-3 million data and the 5359 data removed, to get a good look at the trends?

  17. #17
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    Originally posted by Keroberts1
Could the same data be generated with the 1-3 million data and the 5359 data removed, to get a good look at the trends?
[Attached image]

  18. #18
    Member
    Join Date
    Feb 2003
    Location
    Lucerne, Switzerland
    Posts
    30
    As far as I know there is an approximation which says:

The probability that a number has a prime factor between 2^(n-1) and 2^n is about 1/n.

So, the expected number of such factors among a candidates is a/n.

I think this approximation can be applied here.
To calculate the expected number of prime factors in the range 2^45 to 2^46, take the number of remaining candidates plus the candidates whose known factor is bigger than 2^45, and divide by 46.

If the range below that is completely sieved, that is no problem; and to extend the prediction to higher powers of two, the previous estimates can be taken into account.
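A minimal sketch of that heuristic; the 1/n probability and the divide-by-46 step are from the post above, while the candidate count is a made-up placeholder:

Code:
    # Estimate the number of factors expected in the range 2^(n-1) .. 2^n,
    # using the heuristic that a candidate has a prime factor in that range
    # with probability about 1/n.
    def expected_factors(candidates_without_smaller_factor, n):
        return candidates_without_smaller_factor / n

    # Hypothetical example for 2^45 .. 2^46: candidates still unfactored, plus
    # those whose smallest known factor is above 2^45, divided by 46.
    print(round(expected_factors(600000, 46)))  # ~13,000 with this made-up count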
