Thread: sieve stats are going crazy

  1. #1
    Senior Member
    Join Date
    Jan 2003
    Location
    U.S
    Posts
    123

    stats are going crazy

    I just looked at Mike H's sieve stats, and things don't look good


    First of all, my score dropped by 1500 points
    Second, we have 28,000 MORE candidates to test
    Third, biwema somehow makes up 66% of the
    sieving effort, while 2 days ago he made up only 3-4% :shocked:

    It appears that P-1 factoring is causing this.
    I'm obsessed with stats, so unless this is fixed, what is
    the idea of P-1 factoring ? :bs:

    Important note: I am not blaming Mike H for this. He has done a
    great job of constructing and maintaining the sieve stats, but unfortunately, the P-1 factors make almost all methods of
    calculating scores ineffective

  2. #2
    Senior Member
    Join Date
    Jan 2003
    Location
    U.S
    Posts
    123
    I investigated this problem a bit and it turns out that a
    change in the result.txt file was also a reason for this problem;
    not only P-1 factoring. My bad

  3. #3
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752

    Re: stats are going crazy

    Originally posted by Moo_the_cow
    First of all, my score dropped by 1500 points
    Second, we have 28,000 MORE candidates to test
    Third, biwema somehow makes up 66% of the
    sieving effort, while 2 days ago he made up only 3-4% :shocked:
    The reason for the first two is that Louie changed the lower bound for results.txt from 1T to 3T. As far as I can see, Mike's program takes the data in results.txt and incorporates the changes in it, while assuming the data below that bound is unchanged. Since the lower bound is changed to 3T, today's stats do not reflect the factors for 1T<p<3T. I'm sure Mike will fix this soon.
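
A minimal sketch of why the 1T<p<3T factors dropped out, assuming the stats program combines an archived snapshot of factors below the old lower bound with whatever the current results.txt contains. The function name and the snapshot model are hypothetical, not Mike's actual code.

```python
T = 10**12  # 1T = 10^12


def uncovered_gap(snapshot_upper, results_lower):
    """Return the (low, high) range of p covered by neither source, or None.

    snapshot_upper: the archived data covers factors with p < snapshot_upper
    results_lower:  results.txt covers factors with p >= results_lower
    """
    if results_lower > snapshot_upper:
        return (snapshot_upper, results_lower)
    return None


# Snapshot taken when results.txt started at 1T; the file now starts at 3T,
# so factors between 1T and 3T appear in neither source.
print(uncovered_gap(1 * T, 3 * T))
```

Under this model the fix is simply to rebuild the snapshot so that it reaches the new lower bound.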

    As far as the third point is concerned, you're right. It's because of P-1 factors. I'm sure that will also be fixed as soon as we decide how to score P-1 factors.

    EDIT: Moo, you found the reason faster than I wrote.

  4. #4
    Woo hoo! I'm number 2 in the stats!!!

    Ah I better enjoy it till the stats are fixed.

  5. #5
    Member
    Join Date
    Feb 2003
    Location
    Lucerne, Switzerland
    Posts
    30

    suggestion for cleaning the sieving stats

    Hi,

    It was easy to predict that the new distribution of the p-1 factors will mess up the statistics quite a bit. One factor could score up to 18 million (64 bits).
    It was really funny to see me on the top of the statistics, but believe me, it is quite embarrassing and I feel quite exposed. I hope that this will be fixed soon because this scoring is not fair at all. Sorry if I made some trouble with that.
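
The 18 million figure matches a score of about p/1T for a factor near 2^64. That scoring rule is an assumption made here to reproduce the number; the thread does not spell it out.

```python
T = 10**12  # 1T = 10^12


def sieve_score_guess(p):
    """Hypothetical scoring rule: one point per 1T of factor size."""
    return p / T


largest_64_bit = 2**64 - 1
print(f"{sieve_score_guess(largest_64_bit) / 1e6:.1f} million points")
```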

    What can be done?
    The best thing is to separate the factors found by sieving and p-1.
    The scoring of the factors found by sieving is quite good, and we can leave it as it is. The factors found by p-1 should be scored proportional to the effort they save, when we don’t need to do PRPing. Due to the fft-sizes, this saved effort will be more or less proportional to the computing effort of the p-1 test.

    How can we separate these factors?
    If the range is not too high, we can easily recognize the factors which are found by sieving if a whole range is submitted by one user.
    Factors found by p-1 lie somewhere beyond that and are therefore probably not in a reserved range.

    Now it is possible that some users find big factors and want them to be scored as sieved factors. To prevent people from simply reserving a range around such a factor, we can do the following tests:

    If a big factor is found and there is a small range reserved around it:

    * Did this user find 2 or more factors in that range (but the expected number of factors should not be significantly higher)?
    If so, it is not very probable that p-1 finds 2 factors which are close to each other, and these factors might have been found by sieving.

    * If there is one factor in the range: Is the expected number of factors in that range more than 0.2?
    If not, it is very unlikely that this user just picked that range and found that factor by sieving. Hence, this factor is probably a p-1 factor.
    If the expected number of factors is more than 0.2 and the factor is big, that range must be 100G or more, which would be quite suspicious for users who have not completed that much factoring before (check smoothness here). If the factor is not so big, the range is smaller, but in that case this user won’t get a high score with one factor. So he would need to do that with several factors, which would look suspicious (also check the smoothness of the factor).

    * Check if p-1 is smooth.
    The bigger the factor gets, the smaller the probability that it can be found with a b2 < 100 million. For example, all five factors beyond 1P that were found by sieving are not smooth (a b2 of more than 200 million would be necessary).
    If not,

    Using these 3 tests, it might be easy to separate all the factors. I guess these tests would not be necessary too often if people didn’t try to cheat.
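
One possible reading of the three tests above, as a Python sketch. The function names, the choice of B1 and the way the tests are combined are assumptions for illustration. The smoothness check uses plain trial division and strips every power of a prime up to B1, whereas a real P-1 stage 1 only covers prime powers up to B1.

```python
def is_prime(n):
    """Trial-division primality check, adequate for cofactors up to B2."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def p_minus_1_smooth(p, b1, b2):
    """True if p-1 is B1-smooth apart from at most one prime factor <= B2."""
    m = p - 1
    d = 2
    while d <= b1 and m > 1:
        while m % d == 0:
            m //= d
        d += 1
    # m is now 1, or a product of primes larger than b1
    return m == 1 or (m <= b2 and is_prime(m))


def classify_factor(factors_in_range, expected_in_range, smooth):
    """Guess 'sieve' or 'p-1' for a big factor inside a small reserved range."""
    if factors_in_range >= 2:
        return "sieve"  # test 1: p-1 rarely finds two factors close together
    if expected_in_range <= 0.2:
        return "p-1"    # test 2: range too small to expect a sieve hit
    return "p-1" if smooth else "sieve"  # test 3: smoothness decides


# 211 - 1 = 2 * 3 * 5 * 7, so p-1 is smooth for B1 = 7
print(classify_factor(1, 0.5, p_minus_1_smooth(211, b1=7, b2=100)))
```

The sketch leaves out the side condition in the first test (the number of factors found should not be far above the expected number), since the post does not say how large a deviation would count.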

    Scoring of p-1 factors:
    The goal is to set the weight of a successful p-1 to a value such that a computer scores as much with sieving as with p-1 factoring. I think it will be something like score=c*exp² or similar.

    I hope we can clean up the sieving stats soon, because it is one of the main motivators for many sievers.

    Nevertheless, have fun

    biwema

  6. #6
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479
    Since the lower bound is changed to 3T, today's stats do not reflect the factors for 1T<p<3T. I'm sure Mike will fix this soon.
    Sorry everyone, I had a 7 hour power outage at home last night, so I wasn't able to do anything. Hopefully all will be OK tonight.

  7. #7
    Sieve it, baby!
    Join Date
    Nov 2002
    Location
    Potsdam, Germany
    Posts
    959
    biwema:

    Don't put too much effort into distinguishing sieving factors from P-1 factors. When I find the time for a project of mine, there will be no chance to circumvent the reservation criteria by reserving the range the P-1 factor lies in anymore. It will still take 2-3 weeks from now, though...

  8. #8
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479
    OK, the stats are now using the new 3T file and updated results file, but they are not yet fixed.

    I am experimenting with a few ideas (from biwema, others and myself), but basically I will make p-1 factors roughly equal to a PRP test of that n, but cap the n at a little (maybe 0.5-1M) above the top of the current PRP window.

    I will also change stats for sieving so that again, the max score for any factor will be the same as that for a PRP test, but I'll make sure that any scores up to the point before p-1 factoring started will be maintained, so no one will lose out.

    If I don't get this done in the next few days, it'll be about 10 days before I can get this done. Watch this space. P-1 factorers, enjoy your five minutes of fame.

    Louie, I have noticed one potential issue/problem. In my Tuesday update I had a factor for garo (961094450858074349 | 67607*2^4022171). Neither this factor nor any other for this k/n pair can be seen in the current results.txt (which means garo goes from position 2 to 60). Where has it gone?
    Last edited by MikeH; 06-18-2003 at 04:58 PM.

  9. #9
    Originally posted by MikeH
    Louie, I have noticed one potential issue/problem. In my Tuesday update I had a factor for garo (961094450858074349 | 67607*2^4022171). Neither this factor nor any other for this k/n pair can be seen in the current results.txt (which means garo goes from position 2 to 60). Where has it gone?
    I forgot a -f in the gzip line of the script, so it wasn't updating the file. It's working again.

    -Louie
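
For context, gzip refuses to overwrite an existing output file unless -f is given, so a scheduled script without it can keep serving a stale .gz. A minimal Python sketch of the same compress-and-overwrite step follows; the helper name and the file path in the usage note are hypothetical, and unlike gzip it leaves the source file in place.

```python
import gzip
import shutil
from pathlib import Path


def compress_overwriting(src):
    """Write src + '.gz' next to src, replacing any existing archive."""
    src = Path(src)
    dst = src.with_name(src.name + ".gz")
    # Opening in "wb" mode truncates an existing archive, so the
    # compressed copy always reflects the current contents of src.
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

A call such as compress_overwriting("results.txt") after each update would keep the compressed copy current.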

  10. #10
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479
    I've made a quick fix to the stats that should keep everyone a little happier for the next two weeks, by which time I should have something a little better sorted.

    Simple change is to cap all scores at 3500 points. This roughly equates to the sieve score you'd get for sieving for the period of time it would take to PRP @n=5M.

    Thus some normality is restored.
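
The cap can be pictured as follows. This is only a sketch: it assumes the cap is applied to each factor's score before the scores are summed per user, and the names are hypothetical.

```python
SCORE_CAP = 3500  # points, the value quoted in the post


def capped_score(raw_score, cap=SCORE_CAP):
    """Limit the score of a single factor to the cap."""
    return min(raw_score, cap)


def user_total(raw_scores, cap=SCORE_CAP):
    """Sum a user's factor scores after capping each one."""
    return sum(capped_score(s, cap) for s in raw_scores)


# One huge p-1 factor no longer swamps a handful of ordinary sieve factors.
print(user_total([18_000_000, 40, 75]))
```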
