
Thread: Error Rates Thus Far

  1. #1
    Code:
    Seventeen or Bust error rate report
       Tue Nov  1 12:32:09 EST 2005
    
    pulling prp test result history from database: ok
    
    distinct k/n pairs with prp results: 330789
    k/n pairs with multiple prp results: 118338
    number of dubious tests: 15067
    number of bad tests (best guess): 8212
    
    breakdown of dubious tests by client version:
    
                             prp tests   dubious      dubious
       version              considered     tests   percentage
       ------------------   ----------   -------   ----------
       2.0 TEST                     18        18      100.00%
       2.2.1 T2                      1         1      100.00%
       2.2.1 T3                      1         1      100.00%
       2.2.1 T4                      1         1      100.00%
       2.2.1                         8         3       37.50%
       2.0 TEST 6                    6         2       33.33%
       2.2                        3363       774       23.02%
       2.0                          26         5       19.23%
       2.0 SSE2                   7923      1335       16.85%
       2.0 TEST 5                   33         4       12.12%
       2.5.0                     20560      1754        8.53%
       2.4.0                     24133      2056        8.52%
       1.2.1                        39         3        7.69%
       2.3.0                     28481      2030        7.13%
       1.0.0 SMP - for MathGuy         7859       530        6.74%
       1.2.3                        63         4        6.35%
       2.0 TEST 3                  710        44        6.20%
       1.1.1                       841        52        6.18%
       1.1.0                     21139      1130        5.35%
       1.0.0                     76585      3964        5.18%
       0.9.9                      1230        57        4.63%
       1.2.5                     15951       600        3.76%
       2.4.0 TEST                 1196        37        3.09%
       2.0 TEST 7                  245         7        2.86%
       1.0.2                     16377       423        2.58%
       0.9.7                      3381        82        2.43%
       1.0.1                       210         4        1.90%
       1.2.0                      9837       144        1.46%
       0.9.8                       110         1        0.91%
       1.2.2                       537         1        0.19%
    
    breakdown of dubious tests by user id:
    
                             prp tests   dubious      dubious
       user id              considered     tests   percentage
       ------------------   ----------   -------   ----------
       5202                       1268       391       30.84%
       7192                       1683       343       20.38%
       3                         31852       245        0.77%
       1800                       2092       232       11.09%
       3158                       2827       216        7.64%
       631                        4273       212        4.96%
       7092                        727       160       22.01%
       5584                       3059       152        4.97%
       8141                        626       136       21.73%
       527                         948       134       14.14%
       6308                        550       133       24.18%
       1143                        549       131       23.86%
       589                         406       129       31.77%
       6351                       3173       128        4.03%
       5965                       1996       121        6.06%
       1154                        549       119       21.68%
       2277                        483       113       23.40%
       1269                        741       109       14.71%
       4713                       2253       109        4.84%
       8035                        714       107       14.99%
    
    breakdown of dubious tests by ip address:
    
                             prp tests   dubious      dubious
       ip address           considered     tests   percentage
       ------------------   ----------   -------   ----------
       129.x.x.x                  441       395       89.57%
       69.x.x.x                  5843       389        6.66%
       67.x.x.x                   487       185       37.99%
       62.x.x.x                   1408       179       12.71%
       62.x.x.x                  7751       164        2.12%
       68.x.x.x                    732       160       21.86%
       66.x.x.x                   552       133       24.09%
       24.x.x.x                    374       112       29.95%
       220.x.x.x                 2725       103        3.78%
       216.x.x.x                 7010        97        1.38%
       65.x.x.x                    318        89       27.99%
       62.x.x.x                   491        88       17.92%
       70.x.x.x                  2043        87        4.26%
       4.x.x.x                     813        79        9.72%
       12.x.x.x                    329        76       23.10%
       67.x.x.x                  509        72       14.15%
       131.x.x.x                 151        69       45.70%
       205.x.x.x                5844        69        1.18%
       204.x.x.x                  298        68       22.82%
       12.x.x.x                 159        67       42.14%
    
    breakdown of dubious tests by assignment time:
    
                             prp tests   dubious      dubious
       assignment time      considered     tests   percentage
       ------------------   ----------   -------   ----------
       2002-Dec                  21420       817        3.81%
       2002-Nov                   6676        87        1.30%
       2003-Apr                   8013       619        7.72%
       2003-Aug                   4221       274        6.49%
       2003-Dec                   1971        51        2.59%
       2003-Feb                  10071       869        8.63%
       2003-Jan                  14908      1040        6.98%
       2003-Jul                   8258       445        5.39%
       2003-Jun                  12104       443        3.66%
       2003-Mar                  22696       725        3.19%
       2003-May                  11526       462        4.01%
       2003-Nov                   1517        54        3.56%
       2003-Oct                    855        45        5.26%
       2003-Sep                   1079        60        5.56%
       2004-Apr                   2527        34        1.35%
       2004-Aug                   2143       132        6.16%
       2004-Dec                   4701       899       19.12%
       2004-Feb                   3024        18        0.60%
       2004-Jan                   2818        28        0.99%
       2004-Jul                   2353        74        3.14%
       2004-Jun                   5721        76        1.33%
       2004-Mar                   1179        28        2.37%
       2004-May                   3197        71        2.22%
       2004-Nov                   3015       348       11.54%
       2004-Oct                   2792       349       12.50%
       2004-Sep                   2643       180        6.81%
       2005-Apr                   3544       236        6.66%
       2005-Aug                   4226       402        9.51%
       2005-Feb                   6664       466        6.99%
       2005-Jan                   7247       805       11.11%
       2005-Jul                   4871       399        8.19%
       2005-Jun                   5293       349        6.59%
       2005-Mar                   4912       424        8.63%
       2005-May                   4207       253        6.01%
       2005-Oct                  35258      3135        8.89%
       2005-Sep                   3260       370       11.35%
    
    breakdown of dubious tests by n range:
    
                             prp tests   dubious   bogus    estimated
       n range              considered     tests   tests   error rate
       ------------------   ----------   -------   -----   ----------
                                   803        72      37        4.61%
       0.0M, 0.5M                36677        73      65        0.18%
       0.5M, 1.0M                30952       167     154        0.50%
       1.0M, 1.5M                24920       728     536        2.15%
       1.5M, 2.0M                24507      1152     606        2.47%
       2.0M, 2.5M                24526      1673     860        3.51%
       2.5M, 3.0M                24864      2190    1122        4.51%
       3.0M, 3.5M                23808      2081    1074        4.51%
       3.5M, 4.0M                20916      2022    1059        5.06%
       4.0M, 4.5M                14201      1318     697        4.91%
       4.5M, 5.0M                 1515       112      84        5.54%
       5.0M, 5.5M                 1190       146      88        7.39%
       5.5M, 6.0M                 1000       136      79        7.90%
       6.0M, 6.5M                 1508       255     156       10.34%
       6.5M, 7.0M                 1565       533     309       19.74%
       7.0M, 7.5M                 1323       630     343       25.93%
       7.5M, 8.0M                 2783      1209     635       22.82%
       8.0M, 8.5M                 1436       230     125        8.70%
       8.5M, 9.0M                 1377       185     102        7.41%
       9.0M, 9.5M                 1044       155      81        7.76%

  2. #2
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    MY GOD,

    I have one of the highest error rate machines on prp!! Could you tell me in a PM what the 129.x.x.x IP is? I'm going to take that machine offline ASAP.

    Looks like I'm in good company for high error rates. E, Wilden, we also have a bad machine or two.

    An average error rate of at least 7% from here on, about 1 in 14 tests.

    A quick calculation of firstpass vs. secondpass time: the ratio should probably be in the area of, what, 1 to 12 or less?

    Complete 12 secondpass tests in the time it takes to complete a firstpass.

    This is certainly poetic justice, for myself.

    Also, what is the difference between dubious and bogus?

  3. #3
    Alien 88,

    Could you give the breakdown of dubious tests by user id, ordered by percentage?

  4. #4
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Yes, or even better yet, % by IP with userID. This would help us try to identify problem machines. I'm not sure if posting full IP addresses is a great idea; perhaps the first three and last three digits? Or the last three digits and the userID.

    Also, a lot of these lines are not real error rates; they are selected error rates which are artificially high, correct?

    Example

    Ver 2.5.0: you only considered 20560 tests, of which 1754 were dubious, an 8.53% dubious rate. I'd think that we tested a lot more than 20560 total tests with version 2.5.0.

    ------------------------------------------------------------------

    I'm still curious about the definitions of: PRP tests considered? Dubious? Bogus?
    You guys have actually done a great job picking out errors.

    Are these decent definitions for the following terms?

    PRP tests considered: tests which meet certain criteria, be it..

    - exceptionally long time
    - unassigned
    - from an error-prone user
    - mismatching client versions
    - mismatching residues
    - etc

    I have a feeling the answer to this is simply, "Yes, all of the above and then some, depending on what you're looking at."

    Dubious test:
    Mismatching residues give both tests a dubious score.
    Dubious = mismatching residues for the same k/n pair.


    Bogus:
    Three tests ... two matching (correct tests); the third is a bogus test (a confirmed error).
    Bogus = multiple residues, at least two match; the mismatch is a bogus.

    I have a feeling these are the definitions; correct me if I'm wrong.

    ----------------------------------------------------------------------------

    If the above is correct, I would think these two lines basically show the current error rate.

    Code:
                             prp tests   dubious      dubious
       assignment time      considered     tests   percentage
       ------------------   ----------   -------   ----------
       2005-Oct                  35258      3135        8.89%
       2005-Sep                   3260       370       11.35%
    Especially Oct 05. I would assume these are all of the secondpass tests completed during the month of October. Of these completed tests, 8.89% of the residues did not match the previously obtained residues. These errors are either client or user based. We also don't know if the incorrect residue occurred during the first or secondpass test, since there were only two tests performed.

    If true, this basically means we have a ~0.99 confidence level for all tests n<4M. But it also means we have a ~1% chance we have still missed a prime n<4M; not bad, but not great. Good news is we would only have to recheck 1% of the tests or less, because some have already been triple-checked.


    Code:
                             prp tests   dubious   bogus    estimated
       n range              considered     tests   tests   error rate
       ------------------   ----------   -------   -----   ----------
                                   803        72      37        4.61%
       0.0M, 0.5M                36677        73      65        0.18%
       0.5M, 1.0M                30952       167     154        0.50%
       1.0M, 1.5M                24920       728     536        2.15%
       1.5M, 2.0M                24507      1152     606        2.47%
       2.0M, 2.5M                24526      1673     860        3.51%
       2.5M, 3.0M                24864      2190    1122        4.51%
       3.0M, 3.5M                23808      2081    1074        4.51%
       3.5M, 4.0M                20916      2022    1059        5.06%
    --------

    Edit: Sorry, it didn't dawn on me at first that you divide up the n-range into 0.5M sections (~12K tests per 0.5M n-range). I guess the estimated error rates are probably pretty close to reality. However, I do have a hard time believing in the ~20%'s around 7-8M... 1 in 5 tests... :shocked:
    Last edited by vjs; 11-01-2005 at 05:27 PM.

  5. #5
    Originally posted by vjs
    However I do have a hard time believing in the ~20%'s around 7-8M... 1 in 5 tests... :shocked:
    I don't. I'm pretty sure that it has to do with this (excerpt from the news page):

    <--

    PRIME !! PRIME !! PRIME !! PRIME !!
    Sunday, 02 Jan 2005

    28433 * 2^7830457 + 1 is prime!


    Bug fixed - Windows Upgrade Critical - Also Linux Upgrade
    Friday, 24 Dec 2004

    A very critical bug was found by KenG6 of Team Anandtech and has now been fixed.


    v2.2 Client -- Algorithmic Upgrade for All
    Saturday, 11 Dec 2004

    v2.0 client for SSE2 processors (FASTER!)
    Friday, 29 Oct 2004

    -->


    Copied from Alien88's post (the last number per line is the dubious percentage):

    2004-Dec 4701 899 19.12%

    2.0 TEST 18 18 100.00%
    2.2.1 T2 1 1 100.00%
    2.2.1 T3 1 1 100.00%
    2.2.1 T4 1 1 100.00%
    2.2.1 8 3 37.50%
    2.0 TEST 6 6 2 33.33%
    2.2 3363 774 23.02%
    2.0 26 5 19.23%
    2.0 SSE2 7923 1335 16.85%


    If you have a look at the breakdown of dubious tests by n range, and keep in mind the buggy 2.x clients, I think we have an interesting picture:

    The error rate climbs relatively smoothly with the growth of n, from

    (roughly rounded)

    0.35 % for 1M
    2.3 % for 2M
    4.0 % for 3M
    4.75 % for 4M
    5.3 % for 5M
    7.5 % for 6M (already influenced by buggy client? )

    and is around 8-9 % now (but hard to tell, because most tests have not been secondpassed for n > 6 M right now)


    Something else to consider:

    Firstpass test that missed the prime was done: around summer 2003.
    Prime finally found: October 2005.

    Difference: 26-28 months

    Special unfortunate thing to notice: k=4847 was a relatively dense series that required more tests in a given range than the average k.

    Special fortunate thing to notice: when secondpass was at 2.7M, the creators decided to commit all resources to secondpass

    Estimated time secondpass would have discovered the missed prime without this effort: probably (late?) in 2006

    So, when we look at how secondpass was handled in the past, two questions arise:

    Months of project time wasted: ??
    Months of project time not wasted: ??

    The two extremes we are facing are:

    - doing double work by rechecking relatively close behind firstpass (but knowing for sure (as sure as it can get) about the result)
    - doing no secondpass at all and hoping for the next prime to come quicker, while sacrificing any certainty that the results being sent back are correct.

    We should keep in mind that:
    - the tests are getting larger and larger
    - the error rate, although still low, seems to grow (smoothly) with tests getting bigger
    - the density of primes at higher n gets lower


    It might be helpful to discuss a 'policy' for secondpass, so the project's computational power is used in an optimal way.

  6. #6
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    Apart from the buggy 2.x client, I guess the error rates for n>4.5m are skewed upwards as the data for those ranges contain dropped (and reassigned) tests, where the first test was returned later.

    I guess it would be interesting to see the breakdown of error rates from the time-to-complete perspective as well. I feel like something like,

    0 to 10 days
    10 to 30 days
    30 to 60 days
    60 to 90 days
    90 days or more

    would be interesting to see.

    If, for example, we can see an obvious increase in error rates as the duration increases, we might implement an early warning system for first-time tests: say, reassign immediately as soon as a test takes longer than x (90?) days to complete.
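
    If the raw data were available, the bucketing itself would be trivial. A minimal Python sketch (the input format, a list of (days to complete, dubious flag) pairs, is my assumption, not the server's actual schema):
    Code:
    def duration_breakdown(tests, edges=(10, 30, 60, 90)):
        labels = ['0 to 10 days', '10 to 30 days', '30 to 60 days',
                  '60 to 90 days', '90 days or more']
        stats = [[0, 0] for _ in labels]             # [tests considered, dubious]
        for days, is_dubious in tests:
            i = sum(days >= edge for edge in edges)  # index of the bucket this test falls in
            stats[i][0] += 1
            stats[i][1] += is_dubious
        for label, (considered, dubious) in zip(labels, stats):
            pct = 100.0 * dubious / considered if considered else 0.0
            print('%-16s %10d %9d %9.2f%%' % (label, considered, dubious, pct))

    # toy data: (days to complete, 1 if dubious else 0)
    duration_breakdown([(5, 0), (45, 1), (95, 1), (120, 0)])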

  7. #7
    When the new client comes out it should be easy to have the test automatically report back its status every week or so. If we save intermittent status updates then we can determine where the error was when mismatched residues are found. We could even run two sets of every test simultaneously and have a guaranteed 0% error rate, because every time an error occurs it would automatically back up a few steps to when it knew it was on solid ground and continue from there.
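
    A toy sketch of that scheme in Python, purely illustrative (step() stands in for one iteration of the real PRP loop; in practice the two copies would run on separate hardware, and mismatches would come from hardware faults):
    Code:
    def run_with_cross_check(step, state, iterations, interval=1000):
        a = b = state
        good_state, good_done = state, 0
        done = 0
        while done < iterations:
            n = min(interval, iterations - done)
            for _ in range(n):                   # advance both copies in lockstep
                a, b = step(a), step(b)
            done += n
            if a == b:
                good_state, good_done = a, done  # copies agree: save a checkpoint
            else:
                a = b = good_state               # mismatch: rewind to last good checkpoint
                done = good_done
        return a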

  8. #8
    Former QueueMaster Ken_g6[TA]'s Avatar
    Join Date
    Nov 2002
    Location
    Colorado
    Posts
    184
    I wonder:

    I seem to recall that running the SSE2-enabled algorithm produced a different residue than the non-SSE2 algorithm (at least for a while). Am I remembering correctly, and did you all take this into account?

    Also, I thought I caught the bug mentioned above fairly quickly. Why would the error rate remain so stable over such an apparently long period of time?
    Proud member of the friendliest team around, Team Anandtech!
    The Queue is dead! (Or not needed.) Long Live George Woltman!

  9. #9
    I want to bring up a previous point when we discussed error rates before:

    NOTE THAT THE "DUBIOUS PERCENTAGE" COLUMN IS NOT THE SAME AS ERROR RATE!!!!!!!!

    Firstly, remember that measured error rates above the double-check threshold are completely unreliable. This is because, above that threshold, the only way for a test to have been double-checked is if the first test was dropped, handed out again, reported, and then the original dropped client reappeared and submitted its test, too. This implies the first client was either running for a very, very long time, or was stopped and restarted with long lags in between runs. This kind of behavior is almost certainly MUCH more likely to result in errors. We won't know what the REAL numbers are for a given range until we've done systematic double checks in that range.

    Secondly, it's important to realize what a "dubious test" is. A dubious test is any test where we're not confident the residue is correct. If we had five tests, with residues A, A, A, B and C, the last two (with residues B and C) would both be considered dubious. The reason this is NOT the same as "error rate" is that if there are only two tests with two different residues, BOTH TESTS are considered dubious. But most likely, only one of them is truly wrong. It's just that they're both under suspicion until we can do a third test to confirm one of the residues.

    MISINTERPRETING THESE NUMBERS can lead to headaches, ulcers, stroke, or premature death! Read on at your own risk!!

  10. #10
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    Time for a comparison!!!


    Originally posted by kugano
    Haha! You're all going to beat me over the head for this, and I deserve it. The error rate report I posted earlier (the one that only lists the n breakdown) is completely fubar'd and utterly wrong. When I was programming the report generator, I was using a small subset of the data (50,000 prp tests) to test it. When I did the final run and posted the results in this thread, I forgot to change back to the full dataset! There are really about 325,000 prp tests to consider, so the numbers in my earlier post were only based on about 15% of the actual data!!!

    Here's some MUCH BETTER data w/ estimated error rate, considering all 325,000 prp tests, including the 100 random tests I assigned yesterday.

    Note the by-user breakdown. It appears that some specific users are experiencing extremely high error rates (probably horribly overclocked or otherwise unstable machines). It's starting to look like a case of "5% of the users are responsible for 95% of the total errors," just like world wealth =/

    Also note that the very first report I posted, the one that shows all breakdowns, is correct, using all data... it's just the second one (when I posted the extra "estimated error rate" column) that's broken.
    Code:
    Seventeen or Bust error rate report
       Fri Jan  7 13:03:47 EST 2005
     
    pulling prp test result history from database: ok
     
    distinct k/n pairs with prp results: 271138
    k/n pairs with multiple prp results: 46338
    number of dubious tests: 2851
    number of bad tests (best guess): 1735
     
    breakdown of dubious tests by client version:
     
                             prp tests   dubious      dubious
       version              considered     tests   percentage
       ------------------   ----------   -------   ----------
       2.0 TEST                     18        18      100.00%
       2.2.1 T2                      1         1      100.00%
       2.2.1 T3                      1         1      100.00%
       2.2.1 T4                      1         1      100.00%
       2.2.1                         8         3       37.50%
       2.0 TEST 6                    6         2       33.33%
       2.0                          26         5       19.23%
       2.0 SSE2                   5117       705       13.78%
       2.0 TEST 5                   33         4       12.12%
       2.2                        1452       145        9.99%
       1.2.1                        37         3        8.11%
       1.1.1                       757        49        6.47%
       2.0 TEST 3                  703        45        6.40%
       1.2.3                        61         3        4.92%
       2.2 TEST 3                   28         1        3.57%
       1.2.5                     14876       513        3.45%
       2.0 TEST 7                  244         7        2.87%
       1.0.1                       175         5        2.86%
       1.1.0                     13100       332        2.53%
       1.0.0                     32527       691        2.12%
       2.3.0                      1084        21        1.94%
       1.2.0                      9468       121        1.28%
       0.9.9                       590         6        1.02%
       1.0.2                      8939        90        1.01%
       0.9.7                      1478        13        0.88%
     
    breakdown of dubious tests by user id:
     
                             prp tests   dubious      dubious
       user id              considered     tests   percentage
       ------------------   ----------   -------   ----------
       xxxx                      30125       318        1.06%
       xxxx                      33481       137        0.41%
       xxxx                        213        53       24.88%
       xxxx                        153        42       27.45%
       xxxx                        484        41        8.47%
       xxxx                        121        35       28.93%
       xxxx                        104        33       31.73%
       xxxx                         74        30       40.54%
       xxxx                         76        29       38.16%
       xxxx                        443        26        5.87%
       xxxx                        199        23       11.56%
       xxxx                        111        23       20.72%
       xxxx                        206        22       10.68%
       xxxx                         75        21       28.00%
       xxxx                         99        21       21.21%
       xxxx                        474        21        4.43%
       xxxx                        132        21       15.91%
       xxxx                         62        20       32.26%
       xxxx                       1844        19        1.03%
       xxxx                         61        18       29.51%
     
    breakdown of dubious tests by ip address:
     
                             prp tests   dubious      dubious
       ip address           considered     tests   percentage
       ------------------   ----------   -------   ----------
       xxx.xxx.xxx.xxx            5699        73        1.28%
       xxx.xxx.xxx.xxx             242        58       23.97%
       xxx.xxx.xxx.xxx             728        57        7.83%
       xxx.xxx.xxx.xxx             193        51       26.42%
       xxx.xxx.xxx.xxx             691        51        7.38%
       xxx.xxx.xxx.xxx            4565        47        1.03%
       xxx.xxx.xxx.xxx             453        39        8.61%
       xxx.xxx.xxx.xxx             121        35       28.93%
       xxx.xxx.xxx.xxx            5170        33        0.64%
       xxx.xxx.xxx.xxx              76        29       38.16%
       xxx.xxx.xxx.xxx              43        24       55.81%
       xxx.xxx.xxx.xxx             501        22        4.39%
       xxx.xxx.xxx.xxx             405        21        5.19%
       xxx.xxx.xxx.xxx              50        18       36.00%
       xxx.xxx.xxx.xxx              90        17       18.89%
       xxx.xxx.xxx.xxx              48        17       35.42%
       xxx.xxx.xxx.xxx              16        16      100.00%
       xxx.xxx.xxx.xxx             334        12        3.59%
       xxx.xxx.xxx.xxx             108        12       11.11%
       xxx.xxx.xxx.xxx             327        12        3.67%
     
    breakdown of dubious tests by assignment time:
     
                             prp tests   dubious      dubious
       assignment time      considered     tests   percentage
       ------------------   ----------   -------   ----------
       2002-Dec                   7283       358        4.92%
       2002-Nov                   6226        94        1.51%
       2003-Apr                   2736        96        3.51%
       2003-Aug                   1707        53        3.10%
       2003-Dec                   1931        56        2.90%
       2003-Feb                    582        24        4.12%
       2003-Jan                   1161        44        3.79%
       2003-Jul                   4384        78        1.78%
       2003-Jun                   8154        65        0.80%
       2003-Mar                  14449        54        0.37%
       2003-May                   7495        64        0.85%
       2003-Nov                   1494        66        4.42%
       2003-Oct                    803        63        7.85%
       2003-Sep                    936        62        6.62%
       2004-Apr                   2504        32        1.28%
       2004-Aug                   2090       124        5.93%
       2004-Dec                   3176       372       11.71%
       2004-Feb                   3006        14        0.47%
       2004-Jan                   2610        22        0.84%
       2004-Jul                   2270        72        3.17%
       2004-Jun                   5673        72        1.27%
       2004-Mar                   1158        29        2.50%
       2004-May                   3158        65        2.06%
       2004-Nov                   2730       324       11.87%
       2004-Oct                   2621       344       13.12%
       2004-Sep                   2578       186        7.21%
       2005-Jan                   1115        18        1.61%
     
    breakdown of dubious tests by n range:
     
                             prp tests   dubious   bogus    estimated
       n range              considered     tests   tests   error rate
       ------------------   ----------   -------   -----   ----------
       0.0M, 0.5M                36667        73      65        0.18%
       0.5M, 1.0M                30335       183     149        0.49%
       1.0M, 1.5M                12889       688     356        2.76%
       1.5M, 2.0M                 1131        51      35        3.09%
       2.0M, 2.5M                 1423        45      33        2.32%
       2.5M, 3.0M                  805        40      35        4.35%
       3.0M, 3.5M                 1111        95      71        6.39%
       3.5M, 4.0M                 1734       152     122        7.04%
       4.0M, 4.5M                 1651       149     114        6.90%
       4.5M, 5.0M                 1260       134      70        5.56%
       5.0M, 5.5M                 1021       134      70        6.86%
       5.5M, 6.0M                  903       128      73        8.08%
       6.0M, 6.5M                 1289       217     129       10.01%
       6.5M, 7.0M                 1103       410     230       20.85%
       7.0M, 7.5M                  553       293     153       27.67%
       7.5M, 8.0M                  155        59      30       19.35%

  11. #11
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Thanks for explaining dubious vs bogus, it makes a little more sense now.

    It's pretty obvious from the results that the error rate is increasing with the time taken for a test. This only makes sense, since it only takes one error during the entire test to make the test bogus; longer tests mean a greater chance of one error occurring.

    I did a quick couple of graphs of data trends etc. IMO, at our current error rate we would expect to have a 10% error rate by 14M, with current hardware etc. This rate is a little high; I'm thinking it should go down with faster hardware/computers in the future, etc.

    It also suggests that those 20% error rates are not representative of the project, since they only occur over a small n-range.


    Nuri and I suggested seeding the queue a while back...

    I still suggest that you populate the global queue or high-priority queue with all 5M<n<8M tests for one or two of the lighter k's. This would tell us what to expect for the future, and I don't think it would take that long project-wise. Perhaps 22699 and/or 19249.



    Can anyone check their logs for the time required to complete a test?
    I'm interested in how long it took you to complete a test when n=9.9M and how long it takes with current tests at n=4.5M.

  12. #12
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479
    Thanks for explaining dubious vs bogus it makes a little more sense now.
    Err....where exactly is the description of a bogus test?

    I see the explanation of difference between dubious test and error rate, but absolutely positively no use of the word "bogus" anywhere in that explanation!

  13. #13
    Originally posted by MikeH
    Err....where exactly is the description of a bogus test?

    I see the explanation of difference between dubious test and error rate, but absolutely positively no use of the word "bogus" anywhere in that explanation!
    Actually, it is hidden. With two different residues for one k/n pair, both are dubious. With three residues where one differs from the other two, that one is bogus.

  14. #14
    Senior Member engracio's Avatar
    Join Date
    Jun 2004
    Location
    Illinois
    Posts
    237
    Originally posted by Joh14vers6

    Actually, it is hidden. With two different residues for one k/n pair, both are dubious. With three residues where one differs from the other two, that one is bogus.
    So does that mean I can restart all of my prp machines again after they've completed their factoring/sieving? Or do I still have a bad machine somewhere? I hope it is located locally and not one of my borged machines around the country.


    e

  15. #15
    Maybe I can explain this better. Dubious indicates a doubt as to the veracity or accuracy of a test. Bogus indicates that it is believed that the test results are bad and need to be redone. At least, that's the way I understand it.

  16. #16
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Let me try one more time:

    k/n pair was tested twice giving residues A and A.
    Since residues A and A match, they are both considered correct.

    k/n pair was tested twice giving residues A and B.
    Since residues A and B do not match, both tests are considered dubious.

    Why? Because of the following:
    - Both A and B could be incorrect (two incorrect tests)
    - The second, more likely possibility: one of A or B is correct and the other is incorrect (one correct, one incorrect)

    We don't know which one is correct, or if either is, until a third test is performed; therefore they are both dubious until a third test.



    Once a third test is conducted you probably get the following:

    k/n pair was tested three times giving residues A, B, and A

    - Residues A and A are considered to be correct; residue B is bogus

    Now since two residues match, there is no reason why a fourth test needs to be conducted; in other words, two matching residues are good enough to say that that k/n pair is not prime.

    ---------------------------

    A case that practically cannot happen.

    k/n pair was tested multiple times producing A, A, B, B. The possibility of something like this happening is insanely small. The only time something like this could happen is if two different clients were used, one producing A residues and the other producing B residues, in which case you would expect that all 4 tests are correct.
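
    To make the rules above concrete, here is a minimal Python sketch of one reading of this thread's definitions (the function and labels are illustrative, not the server's actual code):
    Code:
    from collections import Counter

    def classify(residues):
        counts = Counter(residues)
        confirmed = any(c >= 2 for c in counts.values())  # has some residue been matched?
        out = []
        for r in residues:
            if counts[r] >= 2:
                out.append('confirmed')
            elif confirmed:
                out.append('bogus')    # lost out to a matching pair
            else:
                out.append('dubious')  # still waiting on a tie-breaking test
        return out

    print(classify(['A', 'A']))       # ['confirmed', 'confirmed']
    print(classify(['A', 'B']))       # ['dubious', 'dubious']
    print(classify(['A', 'B', 'A']))  # ['confirmed', 'bogus', 'confirmed']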

  17. #17
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    E, you probably have a machine which is producing garbage, like myself.

    Luckily, the only machines I have on prp are borged machines doing secondpass. The funny thing is the bad machine was either a quad-processor Dell server, a Compaq desktop, or a Sony Vaio laptop. Those are the only computers I have running under the 129.x.x.x IP, and all are being run at factory specs with factory components. I'd like to know which one, the server would be...

    With a 30.84% error rate I'm almost thinking it's one processor on the server. Scary, considering it has ECC, RAID-5, etc. etc. etc.

  18. #18
    Originally posted by vjs
    E, you probably have a machine which is producing garbage, like myself.

    Luckily, the only machines I have on prp are borged machines doing secondpass. The funny thing is the bad machine was either a quad-processor Dell server, a Compaq desktop, or a Sony Vaio laptop. Those are the only computers I have running under the 129.x.x.x IP, and all are being run at factory specs with factory components. I'd like to know which one, the server would be...

    With a 30.84% error rate I'm almost thinking it's one processor on the server. Scary, considering it has ECC, RAID-5, etc. etc. etc.
    I guess I have to say it again.. dubious and ERROR are *NOT* the same..

  19. #19
    Senior Member engracio's Avatar
    Join Date
    Jun 2004
    Location
    Illinois
    Posts
    237
    Well, I know I am a rock sometimes and it takes me a couple of iterations before I say huh? It seems to me, unless the big BAD SOB admin sends me a nasty email stating I've been a BAD boy and to cease and desist immediately, I won't, and I will keep on going like the pink-eared bunny. You're welcome for my contribution to the project; it is my pleasure. Off I go back to my hole.


    e:
    Last edited by engracio; 11-03-2005 at 07:06 AM.

  20. #20
    Originally posted by Joh14vers6
    Alien 88,

    Could you give the breakdown of dubious tests by user id, ordered by percentage?
    Any word on this, or did I totally miss it? If I did, please scold me and point me to the right place.

    I don't think I was listed, but I was curious if I could search my User ID for bogus tests because, by my estimation, and depending on the time-frame in question, I might make up a significant component of the 374 tests (of which 112 were bad) from 24.x.x.x, which is a problem IP.

    Those only appear to be top-20-or-so lists noted above. I would like to know if I have submitted 100 bogus tests, for instance (just outside the top 20 by userID), and from which machine(s).
    I have 3-4 machines running SOB off and on at two different IPs, and I might be able to tell by where and when bogus tests are submitted whether or not I have a suspect computer. I could have one computer submitting nothing but junk, which I would obviously want to rectify.

    Is there a complete general list I can browse, or a way to check my submissions?

    Thanks!

  21. #21
    Originally posted by kelman66
    Any word on this, or did I totally miss it? If I did, please scold me and point me to the right place.

    I don't think I was listed, but I was curious if I could search my User ID for bogus tests because, by my estimation, and depending on the time-frame in question, I might make up a significant component of the 374 tests (of which 112 were bad) from 24.x.x.x, which is a problem IP.

    Those only appear to be top-20-or-so lists noted above. I would like to know if I have submitted 100 bogus tests, for instance (just outside the top 20 by userID), and from which machine(s).
    I have 3-4 machines running SOB off and on at two different IPs, and I might be able to tell by where and when bogus tests are submitted whether or not I have a suspect computer. I could have one computer submitting nothing but junk, which I would obviously want to rectify.

    Is there a complete general list I can browse, or a way to check my submissions?

    Thanks!
    No no no no no!!

    dubious doesn't mean 'bogus' or 'bad' or 'error rate'. Please see the definitions above!

    I may be able to show it via user ID sometime this weekend, but no guarantees..

  22. #22
    Basically what you're saying is that if I do a test and return residue A, and then someone else does the test again via secondpass, has an error, and produces residue B, then BOTH tests will be marked as dubious.

    Therefore those people who have done more tests are more likely to have more dubious tests, as their tests are more likely to be paired with other people's bad tests; i.e., they could have a lot of dubious tests even if all of their tests were perfect.



  23. #23
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Originally posted by Alien88
    I guess I have to say it again.. dubious and ERROR are *NOT* the same..
    Yes I understand...

    I'm not concerned about this:
    Code:
       5202                       1268       391       30.84%
    Since I've been running secondpass exclusively for a couple of years.

    What I am concerned about is this:
    Code:
    129.x.x.x                  441       395       89.57%
    With a dubious rate approaching 90%, you can see my concern. Like E, I'll continue my contributions, at least until I get a nasty e-mail saying WHOA DUDE!!!

  24. #24
    Originally posted by vjs
    Yes I understand...

    What I am concerned about is this:
    Code:
    129.x.x.x                  441       395       89.57%
    With a dubious rate approaching 90%, you can see my concern. Like E, I'll continue my contributions, at least until I get a nasty e-mail saying WHOA DUDE!!!
    It still says nothing and could be a coincidence.

  25. #25
    Senior Member
    Join Date
    Jun 2005
    Location
    London, UK
    Posts
    271
    Even if a machine is producing BOGUS tests 90% of the time, it is still producing the correct result 10% of the time.

    If the machine produces an incorrect test then it has to be tested somewhere else until we get a matching residue, no loss there.

    So there is still a gain to be had from machines producing bogus tests doing second-pass. No matching result? Someone else should pick it up and do it.

    Obviously if a machine is producing lots of errors then I'd like to know if it was one of mine as I'd like to get it looked at and get broken CPU/memory replaced.

    Machines producing definitely BOGUS first-pass tests could be a problem, but then that is why we do second-pass double checking. All machines are susceptible to random errors, but it may be better if known dodgy machines are churning out first-pass rubbish.

    All of my machines, including the P4 I'm sceptical of (hence my 21% DUBIOUS results) are doing second-pass so there should be no negative effect.

    I am assuming two things:

    1) A machine doesn't get assigned a test it had previously been assigned, although this isn't a problem if:
    2) It doesn't produce the same incorrect residue if it does.

  26. #26
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Thank you, Greenbank, you're obviously understanding what I'm saying.

    The other possibility is that I've been doing secondpass for a very, very long time. Perhaps the results don't match because the original residue is non-existent, or is a different residue version(?) from client 1.x.x etc.

    I'm only worried it's a borged box running the fubared client. I'm not terribly concerned, since they are secondpass etc.

    Your point of running bogus boxes on secondpass is the way to go, yup. The possibility of matching bogus results is near zero and not worth considering project-wise. I think this applies to E and Wilden as well, who are also in the secondpass boat.

  27. #27
    Any news on the error rates, now that almost everything up to 5M is finished?

  28. #28
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    What does everyone think about the need for availability of a PRP results file?

    I guess there would be some people willing to do some crunching and come up with useful ideas.

    What info should it contain? The more the better, of course. I feel like it should contain

    k
    n
    client version
    residue
    user id
    IP address
    date assigned
    date returned

    data at the minimum.


    Probably monthly updates to the file (say, on the first day of each month) would be enough to show us whether anything is going wrong, and it would give the project coordinators enough lead time in case of a new prime.


    So, what do you think? Any comments?

  29. #29
    Originally posted by Nuri


    k
    n
    client version
    residue
    user id
    IP address
    date assigned
    date returned

    I think with all this we get a data privacy conflict. The publication of IP addresses in particular I would consider an offense.
    Anonymisation should absolutely be done.

    As for the publication of the residues, I see no sense in it, as they are sort of pseudo-random variables, and making them public would just give misguided folks the opportunity to submit residues without testing.

    I think if ever there is somebody interested, for writing a thesis about Distributed Computing or so, he can ask for the data and will get it, through personal interaction with the admins (so they know his name etc.).

    As for our interest in crunching the data to find bad machines, making statistics etc.,
    k
    n
    client version
    the last two digits of the residue
    user id (anonymized)
    possibly IP address (anonymized)
    date assigned
    date returned

    should be sufficient, if necessary. (!?!)
    Though I doubt there is enough interest to make it worth the work. But...
    Yours, H.

  30. #30
    hhh is right on with his comments!
    Making public records of IP addresses and usernames linked to specific tests is a definite no-no. Many people, myself included, feel very strongly about privacy issues.
    A negative too for making residues public: it would make it dead easy for anyone wishing to wreak havoc on the project to forge results.

    Some extra comments from me:
    The IP address is mostly irrelevant if your ISP uses dynamic IP. For example, I have over 50 IPs on record at the site; they all correspond to only 3 PCs, and there's no way except looking at the logs of each machine to find out which one submitted a specific residue. All of them are connected using the same ISP and an ever-changing IP (within a range, but still...). It might of course be more useful for those of you using static IPs, but definitely not for extracting generic project-wide data. I have found that my DSL even occasionally (on average every 1-2 days of continuous online presence) changes IP automatically while operating, so I can very well begin a test with one IP and finish it with a quite different one, making useful data extraction and linking of tests to specific machines difficult or impossible.

    And perhaps making only the last 2 digits of the residue public is not accurate enough either; it's not too unlikely to match the last 2 digits by chance. Perhaps a good idea would be to publish more digits (say, the last 4-5), or present the residue in the form D**2*C1***8*A**5, i.e. more than half of the digits replaced by an asterisk, whichever sounds more reasonable or easier to implement.
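
    That masked form is easy to produce; a quick sketch (the sample residue and the choice of visible positions are made up):
    Code:
    # Keep fewer than half of a 16-hex-digit residue's characters visible.
    def mask_residue(residue, visible=(0, 3, 5, 6, 10, 12, 15)):
        return ''.join(c if i in visible else '*' for i, c in enumerate(residue))

    print(mask_residue('D4527C1AB38FAA95'))  # -> D**2*C1***8*A**5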

    Finally, even a perfectly legitimate use, like writing a thesis, would have to be considered very carefully by the admins before they share any personal data like the IPs. No such agreement is made when you join the site, so anyone affected would have a strong case against the team; not worth the risk IMHO, unless consent is given first by the users.

    Perhaps everyone should be able to access his personal data only, i.e. all the info Nuri posted about, but only when logged in through the personal preferences page.

    I hope all these make sense

  31. #31
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Agreed almost all around...

    I don't think userID is private information; you can easily match a person's userID with their account name and e-mail address (if they made it public) using the server currently.

    On IP addresses... absolutely... never the whole address. However, I think the last 3 digits would work just fine. (userID and the last 3 digits don't mean much.)



    Residues: never, never the full residue.

    On residues you only need the last 3 characters; the chance of two different residues agreeing in the last 3 by accident is probably small enough for us. A thesis might be a different story.

  32. #32
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    Fair enough, thx for the comments.

  33. #33
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    It seems like we have at least two residues for more than 99% of tests below 5M and for more than 80% of the tests between 5M and 6M, as of now.

    Alien88, would you consider an update on error rates?

  34. #34
    Member
    Join Date
    Dec 2002
    Location
    Eugene, Oregon
    Posts
    79
    It seems to me that the error rate pretty much determines how far second pass should lag behind first pass. Suppose the error rate is e, and current second pass exponents are a fraction f of first pass exponents; so, for example, if f=.6, we would be doing second pass at 6M simultaneously with doing first pass at 10M. The amount of work to test all exponents up to n is approximately proportional to n^3. The expected amount of work to find a prime at exponent n would then be:

    (1-e)(n^3 + (fn)^3) + e((n/f)^3 + n^3) = n^3(1+ (1-e)f^3 + e/f^3).

    Minimizing this function with respect to f (setting the derivative to zero gives 3(1-e)f^2 = 3e/f^4, i.e. f^6 = e/(1-e)) then leads to f = (e/(1-e))^(1/6).

    If e = .08, this gives an optimal f of about .665, so one would expect second pass to be at exponents about 2/3 the size of first pass exponents. For e=.015, f should be closer to .5, and this is about the level of the GIMPS double-checking effort. By the time e reaches .25, f should be about .83, and for e equal to .50 (a 50% error rate), the optimal f rises to 1.
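
    A quick numerical sanity check of this closed form (a sketch, nobody's production code): grid-search the bracketed factor 1 + (1-e)f^3 + e/f^3 over f and compare with (e/(1-e))^(1/6).
    Code:
    def optimal_f(e):
        return (e / (1.0 - e)) ** (1.0 / 6.0)

    def work_factor(e, f):
        return 1.0 + (1.0 - e) * f**3 + e / f**3

    for e in (0.015, 0.08, 0.25, 0.50):
        # brute-force minimum over f = 0.001 .. 1.000
        f_best = min(range(1, 1001), key=lambda k: work_factor(e, k / 1000.0)) / 1000.0
        print('e=%.3f  closed form f=%.3f  grid minimum f=%.3f'
              % (e, optimal_f(e), f_best))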

  35. #35
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Looks good, Phil. The only difference between Mersenne and SoB, however, is that we don't really need to test every n. In other words, we just have to find a prime k/n pair.

    Not discounting the accuracy of the above equation, but there are a few minor things that also need to be considered.

    Since we stop once we find a prime for a particular k, this leads to lost work from doing secondpass testing when a firstpass test yields a prime, and vice versa. Not sure how this affects your equation.

    Second, lower-n tests are more likely to be prime than firstpass tests due to prime density. This could easily be factored into your equation.

    Also, we have a fairly substantial sieve and factoring effort which needs to be considered.

    I'm not discounting your equation; I'm sure the above considerations affect f by only a few percent. I may graph your function of e vs f later this week and post it here; nice work and thanks. I just thought you might wish to optimize it further with prime density considerations?

  36. #36
    Member
    Join Date
    Dec 2002
    Location
    Eugene, Oregon
    Posts
    79
    Thanks, your suggestion of factoring in prime densities is an excellent one. As for the "wasted work" issue, that is what I was trying to model with the equation. I made the assumption that there was a prime-yielding exponent at the value n and tried to model the amount of work it would take to find it as proportional to:

    (1-e)(n^3 + (fn)^3) + e((n/f)^3 + n^3)

    (1-e) is the probability that the exponent is found in first pass, e is the probability that it is found in second pass. If the exponent is found in first pass, (fn)^3 represents the amount of work "wasted" in doing second pass tests, while if the exponent is found in second pass, (n/f)^3 represents the amount of work "wasted" in doing first pass tests. (Of course, I am leaving out a proportionality constant in all this.) But certainly the possibility that there is a second prime yielding exponent just a little bit higher is one I haven't taken into account.

    One thing I like about this model is that it allows changing f in response to changes in e. For example, e could be effectively lowered by identifying tests with errors, or tests from machines which have proven less reliable in the past, and reissuing those tests in the first-time queue. A lower e would then lead to a lower f, which would then result in more emphasis on first pass. If f = 1/2, the work done in double-checking only represents 1/8 of the work done doing first-time tests, so double-checking represents about 11% of the total effort. If f = 2/3, then the work done in double-checking represents 8/27 of the first-time work, or 8/35 of the total, about 23% of the total effort.
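
    The arithmetic of that last paragraph as a two-line check (second pass costs f^3 of the first-pass work, so it is f^3/(1+f^3) of the total):
    Code:
    for f in (0.5, 2.0 / 3.0):
        share = f**3 / (1.0 + f**3)
        print('f=%.3f -> double-checking is %.1f%% of total effort' % (f, 100.0 * share))
    # prints about 11.1% for f=1/2 and 22.9% for f=2/3, matching the text above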

    Thanks for your thoughtful comments!

  37. #37
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Phil,

    Thank you for the more detailed explanation; your equation is making a lot more sense to me now. Personally, I wouldn't worry about the possibility of two primes, i.e. a firstpass missing the first occurrence followed by finding a second occurrence before secondpass reveals the first. (Phew... hope that makes sense.)

    I think the likelihood of something like this would be low as long as secondpass doesn't fall too far behind.

    Reflections on factoring:
    I don't think factoring (P-1) makes any difference to your equation, since P-1 is generally done before firstpass testing. Also, P-1 factoring shouldn't be done below the firstpass level or "submitted late" anyway.

    Sieve, on the other hand, does make a difference, since there is a window where sieving can eliminate a test after it has been firstpassed.

    In any case, for sieve, prime density, etc., I believe the contribution is smaller than the uncertainty in the error rate determination itself.

    Graphing your function gives an interesting graph: y is f (secondpass n divided by firstpass n), x is the error rate e (0.01 is a 1% error rate).

    I think this graph aids your explanation from before, and your points.

    - A secondpass n half the size of firstpass n is only good for about a 1.5% error rate.
    - A more reasonable (possibly over-)estimate of the error rate, 10%, would require a secondpass level of 70% (which is about the current second vs. first pass level).

    Well done, Phil; this gives us a lot of food for thought.

    By the time we reach n=12M for firstpass, secondpass should be at 8M, but no more than 8.5M.
    Attached thumbnail: fvse.gif (graph of f vs. e)
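
    For anyone who wants to redraw it, a short sketch (assumes numpy and matplotlib):
    Code:
    import numpy as np
    import matplotlib.pyplot as plt

    # f = (e/(1-e))^(1/6) over error rates from 0.5% to 50%
    e = np.linspace(0.005, 0.5, 200)
    f = (e / (1 - e)) ** (1.0 / 6.0)
    plt.plot(e, f)
    plt.xlabel('error rate e')
    plt.ylabel('f (secondpass n / firstpass n)')
    plt.grid(True)
    plt.savefig('fvse.png')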

  38. #38
    Old Timer jasong's Avatar
    Join Date
    Oct 2004
    Location
    Arkansas(US)
    Posts
    1,778
    For those of you interested in calculating the value of a test in relation to other things (like sieve, P-1 and secondpass), I had a thread on Riesel Sieve. Unfortunately, the sub-forum it's in seems to have disappeared (another sub-forum did this a while back; it was an accident). If the sub-forum comes back I'll put a link here.

    Word of advice: when I post the thread, readers would probably get more out of the experience by reading the posts in reverse order. I made some screwups in the beginning, but I believe I got it right in the end.

    edit: Here it is.
    Last edited by jasong; 02-27-2006 at 06:44 PM.

  39. #39
    I believe you are making it unnecessarily complex. All you need to do is determine which is most likely to produce a prime at any given level. For second pass this is exactly (error rate) x (likelihood of a prime) x (ratio of the time a full test takes compared to firstpass). Make this value equal to the firstpass likelihood of a prime and you have the correct level. Of course these are non-static values which I am incapable of determining. This does seem to me the most appropriate way of starting, however. Test density as affected by P-1 and sieving affects the likelihood of a test returning a prime for all tests in the sieved range, therefore it is unnecessary to include in the overall equation, only in a small part of it, which has been done before in other threads. I believe all of these values have been determined before and should be relatively easy to find in the threads.

  40. #40
    Member
    Join Date
    Dec 2002
    Location
    Eugene, Oregon
    Posts
    79
    I tried to incorporate the suggestions of vjs and am trying to understand why I am now getting a different answer. Not that I wouldn't expect a different answer, but I am getting a smaller f instead of a larger one, whereas I would have thought that taking into account the fact that a small-n test is more likely to lead to a prime would give a larger f.

    The assumptions of the first model were that there was a prime k*2^n+1 at a certain n, and I was trying to minimize the expected amount of time it would take for this prime to show up either at first or second pass.

    Now I am just assuming that a test of k*2^n+1 takes time proportional to n^2*ln n, and that the probability of this number being prime is proportional to 1/ln(k*2^n+1), or 1/n, with a proportionality constant which depends upon how far factors have been sieved for the candidate. Therefore, I am assuming in this model that first and second pass candidates have been sieved to the same general overall level.

    Therefore, the probability per unit time of finding this candidate k*2^n+1 prime is proportional to 1/(n^3*ln n). Overall, the ln n factor makes little difference in finding the optimal strategy.

    So suppose e is the proportion of first pass tests which are in error, and suppose that second pass lags behind first pass by a factor of f < 1. The probability per unit time that a first pass test finds a prime is then approximately proportional to (1-e)/n^3, where n is the current range of testing. If second pass tests are taking place on exponents of size f*n, the probability per unit time of a second pass test finding a prime is then proportional to e/(fn)^3. Equating the two (and ignoring the ln n factor), we see that at the optimal f, where first and second pass tests are equally likely per unit time to find a prime, we must have:

    f = (e/(1-e))^(1/3)

    Note the exponent of 1/3 instead of 1/6! This analysis says that the optimal f is the square of the f in my original model, and therefore smaller; i.e., for e=.08, f is .45 instead of .66, and for e=.015, f is .25 instead of .5, leading to much lower second pass ranges. The paradox is that we are now taking into account that smaller n's are more likely to yield primes, which one would think would increase the importance of second-pass tests, rather than decrease it. On the other hand, because we would now be happy with a larger-n prime if the smallest n happened to have been missed in first pass, that seems to argue for emphasizing first pass more.

    Am I missing anything here? I'd appreciate it if anyone notices something I have overlooked. Thanks again for your comments.

    Two more points: I don't think P-1 changes anything, as P-1 is just a cost-effective way of eliminating some tests. In addition, these primes are rare, so in most cases we would expect that if a first pass prime were missed, nothing would be discovered until second pass catches up to it. So why do the two models make such different predictions about how far behind first pass the optimal second pass tests should be?
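
    The two closed forms side by side, using the formulas exactly as given above (a sketch):
    Code:
    def f_min_work(e):    # first model: minimize expected total work
        return (e / (1.0 - e)) ** (1.0 / 6.0)

    def f_equal_rate(e):  # second model: equal prime probability per unit time
        return (e / (1.0 - e)) ** (1.0 / 3.0)

    for e in (0.015, 0.08, 0.25):
        print('e=%.3f  work-minimizing f=%.2f  rate-equalizing f=%.2f'
              % (e, f_min_work(e), f_equal_rate(e)))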

