
Thread: Error Rates Thus Far

  1. #1
    Code:
    Seventeen or Bust error rate report
       Tue Nov  1 12:32:09 EST 2005
    
    pulling prp test result history from database: ok
    
    distinct k/n pairs with prp results: 330789
    k/n pairs with multiple prp results: 118338
    number of dubious tests: 15067
    number of bad tests (best guess): 8212
    
    breakdown of dubious tests by client version:
    
                             prp tests   dubious      dubious
       version              considered     tests   percentage
       ------------------   ----------   -------   ----------
       2.0 TEST                     18        18      100.00%
       2.2.1 T2                      1         1      100.00%
       2.2.1 T3                      1         1      100.00%
       2.2.1 T4                      1         1      100.00%
       2.2.1                         8         3       37.50%
       2.0 TEST 6                    6         2       33.33%
       2.2                        3363       774       23.02%
       2.0                          26         5       19.23%
       2.0 SSE2                   7923      1335       16.85%
       2.0 TEST 5                   33         4       12.12%
       2.5.0                     20560      1754        8.53%
       2.4.0                     24133      2056        8.52%
       1.2.1                        39         3        7.69%
       2.3.0                     28481      2030        7.13%
       1.0.0 SMP - for MathGuy         7859       530        6.74%
       1.2.3                        63         4        6.35%
       2.0 TEST 3                  710        44        6.20%
       1.1.1                       841        52        6.18%
       1.1.0                     21139      1130        5.35%
       1.0.0                     76585      3964        5.18%
       0.9.9                      1230        57        4.63%
       1.2.5                     15951       600        3.76%
       2.4.0 TEST                 1196        37        3.09%
       2.0 TEST 7                  245         7        2.86%
       1.0.2                     16377       423        2.58%
       0.9.7                      3381        82        2.43%
       1.0.1                       210         4        1.90%
       1.2.0                      9837       144        1.46%
       0.9.8                       110         1        0.91%
       1.2.2                       537         1        0.19%
    
    breakdown of dubious tests by user id:
    
                             prp tests   dubious      dubious
       user id              considered     tests   percentage
       ------------------   ----------   -------   ----------
       5202                       1268       391       30.84%
       7192                       1683       343       20.38%
       3                         31852       245        0.77%
       1800                       2092       232       11.09%
       3158                       2827       216        7.64%
       631                        4273       212        4.96%
       7092                        727       160       22.01%
       5584                       3059       152        4.97%
       8141                        626       136       21.73%
       527                         948       134       14.14%
       6308                        550       133       24.18%
       1143                        549       131       23.86%
       589                         406       129       31.77%
       6351                       3173       128        4.03%
       5965                       1996       121        6.06%
       1154                        549       119       21.68%
       2277                        483       113       23.40%
       1269                        741       109       14.71%
       4713                       2253       109        4.84%
       8035                        714       107       14.99%
    
    breakdown of dubious tests by ip address:
    
                             prp tests   dubious      dubious
       ip address           considered     tests   percentage
       ------------------   ----------   -------   ----------
       129.x.x.x                  441       395       89.57%
       69.x.x.x                  5843       389        6.66%
       67.x.x.x                   487       185       37.99%
       62.x.x.x                   1408       179       12.71%
       62.x.x.x                  7751       164        2.12%
       68.x.x.x                    732       160       21.86%
       66.x.x.x                   552       133       24.09%
       24.x.x.x                    374       112       29.95%
       220.x.x.x                 2725       103        3.78%
       216.x.x.x                 7010        97        1.38%
       65.x.x.x                    318        89       27.99%
       62.x.x.x                   491        88       17.92%
       70.x.x.x                  2043        87        4.26%
       4.x.x.x                     813        79        9.72%
       12.x.x.x                    329        76       23.10%
       67.x.x.x                  509        72       14.15%
       131.x.x.x                 151        69       45.70%
       205.x.x.x                5844        69        1.18%
       204.x.x.x                  298        68       22.82%
       12.x.x.x                 159        67       42.14%
    
    breakdown of dubious tests by assignment time:
    
                             prp tests   dubious      dubious
       assignment time      considered     tests   percentage
       ------------------   ----------   -------   ----------
       2002-Dec                  21420       817        3.81%
       2002-Nov                   6676        87        1.30%
       2003-Apr                   8013       619        7.72%
       2003-Aug                   4221       274        6.49%
       2003-Dec                   1971        51        2.59%
       2003-Feb                  10071       869        8.63%
       2003-Jan                  14908      1040        6.98%
       2003-Jul                   8258       445        5.39%
       2003-Jun                  12104       443        3.66%
       2003-Mar                  22696       725        3.19%
       2003-May                  11526       462        4.01%
       2003-Nov                   1517        54        3.56%
       2003-Oct                    855        45        5.26%
       2003-Sep                   1079        60        5.56%
       2004-Apr                   2527        34        1.35%
       2004-Aug                   2143       132        6.16%
       2004-Dec                   4701       899       19.12%
       2004-Feb                   3024        18        0.60%
       2004-Jan                   2818        28        0.99%
       2004-Jul                   2353        74        3.14%
       2004-Jun                   5721        76        1.33%
       2004-Mar                   1179        28        2.37%
       2004-May                   3197        71        2.22%
       2004-Nov                   3015       348       11.54%
       2004-Oct                   2792       349       12.50%
       2004-Sep                   2643       180        6.81%
       2005-Apr                   3544       236        6.66%
       2005-Aug                   4226       402        9.51%
       2005-Feb                   6664       466        6.99%
       2005-Jan                   7247       805       11.11%
       2005-Jul                   4871       399        8.19%
       2005-Jun                   5293       349        6.59%
       2005-Mar                   4912       424        8.63%
       2005-May                   4207       253        6.01%
       2005-Oct                  35258      3135        8.89%
       2005-Sep                   3260       370       11.35%
    
    breakdown of dubious tests by n range:
    
                             prp tests   dubious   bogus    estimated
       n range              considered     tests   tests   error rate
       ------------------   ----------   -------   -----   ----------
                                   803        72      37        4.61%
       0.0M, 0.5M                36677        73      65        0.18%
       0.5M, 1.0M                30952       167     154        0.50%
       1.0M, 1.5M                24920       728     536        2.15%
       1.5M, 2.0M                24507      1152     606        2.47%
       2.0M, 2.5M                24526      1673     860        3.51%
       2.5M, 3.0M                24864      2190    1122        4.51%
       3.0M, 3.5M                23808      2081    1074        4.51%
       3.5M, 4.0M                20916      2022    1059        5.06%
       4.0M, 4.5M                14201      1318     697        4.91%
       4.5M, 5.0M                 1515       112      84        5.54%
       5.0M, 5.5M                 1190       146      88        7.39%
       5.5M, 6.0M                 1000       136      79        7.90%
       6.0M, 6.5M                 1508       255     156       10.34%
       6.5M, 7.0M                 1565       533     309       19.74%
       7.0M, 7.5M                 1323       630     343       25.93%
       7.5M, 8.0M                 2783      1209     635       22.82%
       8.0M, 8.5M                 1436       230     125        8.70%
       8.5M, 9.0M                 1377       185     102        7.41%
       9.0M, 9.5M                 1044       155      81        7.76%

  2. #2
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    MY GOD,

    I have one of the highest error rate machines on prp!! Could you tell me in a PM what the 129.x.x.x IP is? I'm going to take that machine offline ASAP.

    Looks like I'm in good company for high error rates. E, Wilden, we also have a bad machine or two.

    An average error rate of at least 7% from here on, about 1 in 14 tests.

    A quick calculation of firstpass vs. secondpass time: the ratio should probably be in the area of, what, 1 to 12 or less?

    Complete 12 secondpass tests in the time it takes to complete a firstpass.

    This is certainly poetic justice, for myself.

    Also, what is the difference between dubious and bogus?

  3. #3
    Alien 88,

    Could you give the breakdown of dubious tests by user id, ordered by percentage?

  4. #4
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Yes, or even better yet, % by IP with userID. This would help us try to identify problem machines. I'm not sure if posting full IP addresses is a great idea; perhaps the first three and last three digits? Or the last three digits and the userID.

    Also, a lot of these lines are not real error rates; they are selected error rates which are artificially high, correct?

    Example

    Ver 2.5.0: you only considered 20560 tests, of which 1754 were dubious, an 8.53% dubious rate. I'd think that we tested a lot more than 20560 total tests with version 2.5.0.

    ------------------------------------------------------------------

    I'm still curious about the definitions of: PRP tests considered? Dubious? Bogus?
    You guys have actually done a great job picking out errors.

    Are these decent definitions for the following terms?

    PRP tests considered: tests which meet certain criteria, be it..

    - exceptionally long time
    - unassigned
    - from an error-prone user
    - mismatching client versions
    - mismatching residues
    - etc

    I have a feeling the answer to this is simply, "Yes, all of the above and then some, depending on what you're looking at."

    Dubious test:
    Mismatching residues give both tests a dubious score.
    Dubious = mismatching residues for the same k/n pair.


    Bogus:
    Three tests ... two matching (correct tests); the third is a bogus test (a confirmed error).
    Bogus = multiple residues, at least two match; the mismatch is a bogus.

    I have a feeling these are the definitions; correct me if I'm wrong.

    ----------------------------------------------------------------------------

    If the above is correct, I would think these two lines basically show the current error rate.

    Code:
                             prp tests   dubious      dubious
       assignment time      considered     tests   percentage
       ------------------   ----------   -------   ----------
       2005-Oct                  35258      3135        8.89%
       2005-Sep                   3260       370       11.35%
    Especially Oct 05. I would assume these are all of the secondpass tests completed during the month of October. Of these completed tests, 8.89% of the residues did not match the previously obtained residues. These errors are either client or user based. We also don't know if the incorrect residue occurred during the first or secondpass test, since there were only two tests performed.

    If true, this basically means we have a ~0.99 confidence level for all tests n<4M. But it also means we have a ~1% chance we have still missed a prime n<4M; not bad, but not great. Good news is we would only have to recheck 1% of the tests or less, because some have already been triple-checked.


    Code:
                             prp tests   dubious   bogus    estimated
       n range              considered     tests   tests   error rate
       ------------------   ----------   -------   -----   ----------
                                   803        72      37        4.61%
       0.0M, 0.5M                36677        73      65        0.18%
       0.5M, 1.0M                30952       167     154        0.50%
       1.0M, 1.5M                24920       728     536        2.15%
       1.5M, 2.0M                24507      1152     606        2.47%
       2.0M, 2.5M                24526      1673     860        3.51%
       2.5M, 3.0M                24864      2190    1122        4.51%
       3.0M, 3.5M                23808      2081    1074        4.51%
       3.5M, 4.0M                20916      2022    1059        5.06%
    --------

    Edit: Sorry, it didn't dawn on me at first that you divide up the n-range into 0.5M sections (~12K tests per 0.5M n-range). I guess the estimated error rates are probably pretty close to reality. However, I do have a hard time believing in the ~20%'s around 7-8M... 1 in 5 tests... :shocked:
    Last edited by vjs; 11-01-2005 at 05:27 PM.

  5. #5
    Originally posted by vjs
    However I do have a hard time believing in the ~20%'s around 7-8M... 1 in 5 tests... :shocked:
    I don't. I'm pretty sure that it has to do with this (excerpt from the news page):

    <--

    PRIME !! PRIME !! PRIME !! PRIME !!
    Sunday, 02 Jan 2005

    28433 * 2^7830457 + 1 is prime!


    Bug fixed - Windows Upgrade Critical - Also Linux Upgrade
    Friday, 24 Dec 2004

    A very critical bug was found by KenG6 of Team Anandtech and has now been fixed.


    v2.2 Client -- Algorithmic Upgrade for All
    Saturday, 11 Dec 2004

    v2.0 client for SSE2 processors (FASTER!)
    Friday, 29 Oct 2004

    -->


    Copied from Alien88's post (the last number per line is the dubious percentage):

    2004-Dec 4701 899 19.12%

    2.0 TEST 18 18 100.00%
    2.2.1 T2 1 1 100.00%
    2.2.1 T3 1 1 100.00%
    2.2.1 T4 1 1 100.00%
    2.2.1 8 3 37.50%
    2.0 TEST 6 6 2 33.33%
    2.2 3363 774 23.02%
    2.0 26 5 19.23%
    2.0 SSE2 7923 1335 16.85%


    If you have a look at the breakdown of dubious tests by n range, and keep in mind the buggy 2.x clients, I think we have an interesting picture:

    The error rate climbs relatively smoothly with the growth of n, from

    (roughly rounded)

    0.35 % for 1M
    2.3 % for 2M
    4.0 % for 3M
    4.75 % for 4M
    5.3 % for 5M
    7.5 % for 6M (already influenced by buggy client? )

    and is around 8-9 % now (but hard to tell, because most tests have not been secondpassed for n > 6 M right now)


    Something else to consider:

    Firstpass test that missed the prime was done: around summer 2003.
    Prime finally found: October 2005.

    Difference: 26-28 months

    Special unfortunate thing to notice: k=4847 was a relatively dense series that required more tests in a given range than the average k.

    Special fortunate thing to notice: when secondpass was at 2.7M, the creators decided to commit all resources to secondpass

    Estimated time secondpass would have discovered the missed prime without this effort: probably (late?) in 2006

    So, when we look at how secondpass was handled in the past, two questions arise:

    Months of project time wasted: ??
    Months of project time not wasted: ??

    The two extremes we are facing are:

    - doing double work by rechecking relatively close behind firstpass (but knowing for sure (as sure as it can get) about the result)
    - doing no secondpass at all and hoping for the next prime to come quicker, while sacrificing any certainty that the results being sent back are correct.

    We should keep in mind that:
    - the tests are getting larger and larger
    - the error rate, although still low, seems to grow (smoothly) with tests getting bigger
    - the density of primes at higher n gets lower


    It might be helpful to discuss a 'policy' for secondpass, so the project's computational power is used in an optimal way.

  6. #6
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    Apart from the buggy 2.x client, I guess the error rates for n>4.5m are skewed upwards as the data for those ranges contain dropped (and reassigned) tests, where the first test was returned later.

    I guess it would be interesting to see the breakdown of error rates from the time-to-complete perspective as well. I feel like something like,

    0 to 10 days
    10 to 30 days
    30 to 60 days
    60 to 90 days
    90 days or more

    would be interesting to see.

    If, for example, we can see an obvious increase in error rates as the duration increases, we might implement an early warning system for first-time tests: say, reassign immediately as soon as a test takes longer than x (90?) days to complete.
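
    If the raw data were available, the bucketing itself would be trivial. A minimal Python sketch (the input format, a list of (days to complete, dubious flag) pairs, is my assumption, not the server's actual schema):
    Code:
    def duration_breakdown(tests, edges=(10, 30, 60, 90)):
        labels = ['0 to 10 days', '10 to 30 days', '30 to 60 days',
                  '60 to 90 days', '90 days or more']
        stats = [[0, 0] for _ in labels]             # [tests considered, dubious]
        for days, is_dubious in tests:
            i = sum(days >= edge for edge in edges)  # index of the bucket this test falls in
            stats[i][0] += 1
            stats[i][1] += is_dubious
        for label, (considered, dubious) in zip(labels, stats):
            pct = 100.0 * dubious / considered if considered else 0.0
            print('%-16s %10d %9d %9.2f%%' % (label, considered, dubious, pct))

    # toy data: (days to complete, 1 if dubious else 0)
    duration_breakdown([(5, 0), (45, 1), (95, 1), (120, 0)])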

  7. #7
    When the new client comes out it should be easy to have the test automatically report back its status every week or so. If we save intermittent status updates then we can determine where the error was when mismatched residues are found. We could even run two sets of every test simultaneously and have a guaranteed 0% error rate, because every time an error occurs it would automatically back up a few steps to when it knew it was on solid ground and continue from there.
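
    A toy sketch of that scheme in Python, purely illustrative (step() stands in for one iteration of the real PRP loop; in practice the two copies would run on separate hardware, and mismatches would come from hardware faults):
    Code:
    def run_with_cross_check(step, state, iterations, interval=1000):
        a = b = state
        good_state, good_done = state, 0
        done = 0
        while done < iterations:
            n = min(interval, iterations - done)
            for _ in range(n):                   # advance both copies in lockstep
                a, b = step(a), step(b)
            done += n
            if a == b:
                good_state, good_done = a, done  # copies agree: save a checkpoint
            else:
                a = b = good_state               # mismatch: rewind to last good checkpoint
                done = good_done
        return a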

  8. #8
    Former QueueMaster Ken_g6[TA]'s Avatar
    Join Date
    Nov 2002
    Location
    Colorado
    Posts
    184
    I wonder:

    I seem to recall that running the SSE2-enabled algorithm produced a different residue than the non-SSE2 algorithm (at least for a while). Am I remembering correctly, and did you all take this into account?

    Also, I thought I caught the bug mentioned above fairly quickly. Why would the error rate remain so stable over such an apparently long period of time?
    Proud member of the friendliest team around, Team Anandtech!
    The Queue is dead! (Or not needed.) Long Live George Woltman!

  9. #9
    I want to bring up a previous point when we discussed error rates before:

    NOTE THAT THE "DUBIOUS PERCENTAGE" COLUMN IS NOT THE SAME AS ERROR RATE!!!!!!!!

    Firstly, remember that measured error rates above the double-check threshold are completely unreliable. This is because, above that threshold, the only way for a test to have been double-checked is if the first test was dropped, handed out again, reported, and then the original dropped client reappeared and submitted its test, too. This implies the first client was either running for a very, very long time, or was stopped and restarted with long lags in between runs. This kind of behavior is almost certainly MUCH more likely to result in errors. We won't know what the REAL numbers are for a given range until we've done systematic double checks in that range.

    Secondly, it's important to realize what a "dubious test" is. A dubious test is any test where we're not confident the residue is correct. If we had five tests, with residues A, A, A, B and C, the last two (with residues B and C) would both be considered dubious. The reason this is NOT the same as "error rate" is that if there are only two tests with two different residues, BOTH TESTS are considered dubious. But most likely, only one of them is truly wrong. It's just that they're both under suspicion until we can do a third test to confirm one of the residues.

    MISINTERPRETING THESE NUMBERS can lead to headaches, ulcers, stroke, or premature death! Read on at your own risk!!

  10. #10
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    Time for a comparison!!!


    Originally posted by kugano
    Haha! You're all going to beat me over the head for this, and I deserve it. The error rate report I posted earlier (the one that only lists the n breakdown) is completely fubar'd and utterly wrong. When I was programming the report generator, I was using a small subset of the data (50,000 prp tests) to test it. When I did the final run and posted the results in this thread, I forgot to change back to the full dataset! There are really about 325,000 prp tests to consider, so the numbers in my earlier post were only based on about 15% of the actual data!!!

    Here's some MUCH BETTER data w/ estimated error rate, considering all 325,000 prp tests, including the 100 random tests I assigned yesterday.

    Note the by-user breakdown. It appears that some specific users are experiencing extremely high error rates (probably horribly overclocked or otherwise unstable machines). It's starting to look like a case of "5% of the users are responsible for 95% of the total errors," just like world wealth =/

    Also note that the very first report I posted, the one that shows all breakdowns, is correct, using all data... it's just the second one (when I posted the extra "estimated error rate" column) that's broken.
    Code:
    Seventeen or Bust error rate report
       Fri Jan  7 13:03:47 EST 2005
     
    pulling prp test result history from database: ok
     
    distinct k/n pairs with prp results: 271138
    k/n pairs with multiple prp results: 46338
    number of dubious tests: 2851
    number of bad tests (best guess): 1735
     
    breakdown of dubious tests by client version:
     
                             prp tests   dubious      dubious
       version              considered     tests   percentage
       ------------------   ----------   -------   ----------
       2.0 TEST                     18        18      100.00%
       2.2.1 T2                      1         1      100.00%
       2.2.1 T3                      1         1      100.00%
       2.2.1 T4                      1         1      100.00%
       2.2.1                         8         3       37.50%
       2.0 TEST 6                    6         2       33.33%
       2.0                          26         5       19.23%
       2.0 SSE2                   5117       705       13.78%
       2.0 TEST 5                   33         4       12.12%
       2.2                        1452       145        9.99%
       1.2.1                        37         3        8.11%
       1.1.1                       757        49        6.47%
       2.0 TEST 3                  703        45        6.40%
       1.2.3                        61         3        4.92%
       2.2 TEST 3                   28         1        3.57%
       1.2.5                     14876       513        3.45%
       2.0 TEST 7                  244         7        2.87%
       1.0.1                       175         5        2.86%
       1.1.0                     13100       332        2.53%
       1.0.0                     32527       691        2.12%
       2.3.0                      1084        21        1.94%
       1.2.0                      9468       121        1.28%
       0.9.9                       590         6        1.02%
       1.0.2                      8939        90        1.01%
       0.9.7                      1478        13        0.88%
     
    breakdown of dubious tests by user id:
     
                             prp tests   dubious      dubious
       user id              considered     tests   percentage
       ------------------   ----------   -------   ----------
       xxxx                      30125       318        1.06%
       xxxx                      33481       137        0.41%
       xxxx                        213        53       24.88%
       xxxx                        153        42       27.45%
       xxxx                        484        41        8.47%
       xxxx                        121        35       28.93%
       xxxx                        104        33       31.73%
       xxxx                         74        30       40.54%
       xxxx                         76        29       38.16%
       xxxx                        443        26        5.87%
       xxxx                        199        23       11.56%
       xxxx                        111        23       20.72%
       xxxx                        206        22       10.68%
       xxxx                         75        21       28.00%
       xxxx                         99        21       21.21%
       xxxx                        474        21        4.43%
       xxxx                        132        21       15.91%
       xxxx                         62        20       32.26%
       xxxx                       1844        19        1.03%
       xxxx                         61        18       29.51%
     
    breakdown of dubious tests by ip address:
     
                             prp tests   dubious      dubious
       ip address           considered     tests   percentage
       ------------------   ----------   -------   ----------
       xxx.xxx.xxx.xxx            5699        73        1.28%
       xxx.xxx.xxx.xxx             242        58       23.97%
       xxx.xxx.xxx.xxx             728        57        7.83%
       xxx.xxx.xxx.xxx             193        51       26.42%
       xxx.xxx.xxx.xxx             691        51        7.38%
       xxx.xxx.xxx.xxx            4565        47        1.03%
       xxx.xxx.xxx.xxx             453        39        8.61%
       xxx.xxx.xxx.xxx             121        35       28.93%
       xxx.xxx.xxx.xxx            5170        33        0.64%
       xxx.xxx.xxx.xxx              76        29       38.16%
       xxx.xxx.xxx.xxx              43        24       55.81%
       xxx.xxx.xxx.xxx             501        22        4.39%
       xxx.xxx.xxx.xxx             405        21        5.19%
       xxx.xxx.xxx.xxx              50        18       36.00%
       xxx.xxx.xxx.xxx              90        17       18.89%
       xxx.xxx.xxx.xxx              48        17       35.42%
       xxx.xxx.xxx.xxx              16        16      100.00%
       xxx.xxx.xxx.xxx             334        12        3.59%
       xxx.xxx.xxx.xxx             108        12       11.11%
       xxx.xxx.xxx.xxx             327        12        3.67%
     
    breakdown of dubious tests by assignment time:
     
                             prp tests   dubious      dubious
       assignment time      considered     tests   percentage
       ------------------   ----------   -------   ----------
       2002-Dec                   7283       358        4.92%
       2002-Nov                   6226        94        1.51%
       2003-Apr                   2736        96        3.51%
       2003-Aug                   1707        53        3.10%
       2003-Dec                   1931        56        2.90%
       2003-Feb                    582        24        4.12%
       2003-Jan                   1161        44        3.79%
       2003-Jul                   4384        78        1.78%
       2003-Jun                   8154        65        0.80%
       2003-Mar                  14449        54        0.37%
       2003-May                   7495        64        0.85%
       2003-Nov                   1494        66        4.42%
       2003-Oct                    803        63        7.85%
       2003-Sep                    936        62        6.62%
       2004-Apr                   2504        32        1.28%
       2004-Aug                   2090       124        5.93%
       2004-Dec                   3176       372       11.71%
       2004-Feb                   3006        14        0.47%
       2004-Jan                   2610        22        0.84%
       2004-Jul                   2270        72        3.17%
       2004-Jun                   5673        72        1.27%
       2004-Mar                   1158        29        2.50%
       2004-May                   3158        65        2.06%
       2004-Nov                   2730       324       11.87%
       2004-Oct                   2621       344       13.12%
       2004-Sep                   2578       186        7.21%
       2005-Jan                   1115        18        1.61%
     
    breakdown of dubious tests by n range:
     
                             prp tests   dubious   bogus    estimated
       n range              considered     tests   tests   error rate
       ------------------   ----------   -------   -----   ----------
       0.0M, 0.5M                36667        73      65        0.18%
       0.5M, 1.0M                30335       183     149        0.49%
       1.0M, 1.5M                12889       688     356        2.76%
       1.5M, 2.0M                 1131        51      35        3.09%
       2.0M, 2.5M                 1423        45      33        2.32%
       2.5M, 3.0M                  805        40      35        4.35%
       3.0M, 3.5M                 1111        95      71        6.39%
       3.5M, 4.0M                 1734       152     122        7.04%
       4.0M, 4.5M                 1651       149     114        6.90%
       4.5M, 5.0M                 1260       134      70        5.56%
       5.0M, 5.5M                 1021       134      70        6.86%
       5.5M, 6.0M                  903       128      73        8.08%
       6.0M, 6.5M                 1289       217     129       10.01%
       6.5M, 7.0M                 1103       410     230       20.85%
       7.0M, 7.5M                  553       293     153       27.67%
       7.5M, 8.0M                  155        59      30       19.35%

  11. #11
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Thanks for explaining dubious vs bogus, it makes a little more sense now.

    It's pretty obvious from the results that the error rate is increasing with the time taken for a test. This only makes sense, since it only takes one error during the entire test to make the test bogus; longer tests mean a greater chance of one error occurring.

    I did a quick couple of graphs of data trends etc. IMO, at our current error rate we would expect to have a 10% error rate by 14M, with current hardware etc. This rate is a little high; I'm thinking it should go down with faster hardware/computers in the future, etc.

    It also suggests that those 20% error rates are not representative of the project, since they only occur over a small n-range.


    Nuri and I suggested seeding the queue a while back...

    I still suggest that you populate the global queue or high-priority queue with all 5M<n<8M tests for one or two of the lighter k's. This would tell us what to expect for the future, and I don't think it would take that long project-wise. Perhaps 22699 and/or 19249.



    Can anyone check their logs for the time required to complete a test?
    I'm interested in how long it took you to complete a test when n=9.9M and how long it takes with current tests at n=4.5M.

  12. #12
    Senior Member
    Join Date
    Jan 2003
    Location
    UK
    Posts
    479
    Thanks for explaining dubious vs bogus it makes a little more sense now.
    Err....where exactly is the description of a bogus test?

    I see the explanation of difference between dubious test and error rate, but absolutely positively no use of the word "bogus" anywhere in that explanation!

  13. #13
    Originally posted by MikeH
    Err....where exactly is the description of a bogus test?

    I see the explanation of difference between dubious test and error rate, but absolutely positively no use of the word "bogus" anywhere in that explanation!
    Actually, it is hidden. With two different residues for one k/n pair, both are dubious. With three residues where one differs from the other two, that one is bogus.

  14. #14
    Senior Member engracio's Avatar
    Join Date
    Jun 2004
    Location
    Illinois
    Posts
    237
    Originally posted by Joh14vers6

    Actually, it is hidden. With two different residues for one k/n pair, both are dubious. With three residues where one differs from the other two, that one is bogus.
    So does that mean I can restart all of my prp machines again after they've completed their factoring/sieving? Or do I still have a bad machine somewhere? I hope it is located locally and not one of my borged machines around the country.


    e

  15. #15
    Maybe I can explain this better. Dubious indicates a doubt as to the veracity or accuracy of a test. Bogus indicates that it is believed that the test results are bad and need to be redone. At least, that's the way I understand it.

  16. #16
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Let me try one more time:

    k/n pair was tested twice giving residues A and A.
    Since residues A and A match, they are both considered correct.

    k/n pair was tested twice giving residues A and B.
    Since residues A and B do not match, both tests are considered dubious.

    Why? Because of the following:
    - Both A and B could be incorrect (two incorrect tests)
    - The second, more likely possibility: one of A or B is correct and the other is incorrect (one correct, one incorrect)

    We don't know which one is correct, or if either is, until a third test is performed; therefore they are both dubious until a third test.



    Once a third test is conducted you probably get the following:

    k/n pair was tested three times giving residues A, B, and A

    - Residues A and A are considered to be correct; residue B is bogus

    Now since two residues match, there is no reason why a fourth test needs to be conducted; in other words, two matching residues are good enough to say that that k/n pair is not prime.

    ---------------------------

    A case that practically cannot happen.

    k/n pair was tested multiple times producing A, A, B, B. The possibility of something like this happening is insanely small. The only time something like this could happen is if two different clients were used, one producing A residues and the other producing B residues, in which case you would expect that all 4 tests are correct.
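
    To make the rules above concrete, here is a minimal Python sketch of one reading of this thread's definitions (the function and labels are illustrative, not the server's actual code):
    Code:
    from collections import Counter

    def classify(residues):
        counts = Counter(residues)
        confirmed = any(c >= 2 for c in counts.values())  # has some residue been matched?
        out = []
        for r in residues:
            if counts[r] >= 2:
                out.append('confirmed')
            elif confirmed:
                out.append('bogus')    # lost out to a matching pair
            else:
                out.append('dubious')  # still waiting on a tie-breaking test
        return out

    print(classify(['A', 'A']))       # ['confirmed', 'confirmed']
    print(classify(['A', 'B']))       # ['dubious', 'dubious']
    print(classify(['A', 'B', 'A']))  # ['confirmed', 'bogus', 'confirmed']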

  17. #17
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    E, you probably have a machine which is producing garbage, like myself.

    Luckily, the only machines I have on prp are borged machines doing secondpass. The funny thing is the bad machine was either a quad-processor Dell server, a Compaq desktop, or a Sony Vaio laptop. Those are the only computers I have running under the 129.x.x.x IP, and all are being run at factory specs with factory components. I'd like to know which one, the server would be...

    With a 30.84% error rate I'm almost thinking it's one processor on the server. Scary, considering it has ECC, RAID-5, etc. etc. etc.

  18. #18
    Originally posted by vjs
    E, you probably have a machine which is producing garbage, like myself.

    Luckily, the only machines I have on prp are borged machines doing secondpass. The funny thing is the bad machine was either a quad-processor Dell server, a Compaq desktop, or a Sony Vaio laptop. Those are the only computers I have running under the 129.x.x.x IP, and all are being run at factory specs with factory components. I'd like to know which one, the server would be...

    With a 30.84% error rate I'm almost thinking it's one processor on the server. Scary, considering it has ECC, RAID-5, etc. etc. etc.
    I guess I have to say it again.. dubious and ERROR are *NOT* the same..

  19. #19
    Senior Member engracio's Avatar
    Join Date
    Jun 2004
    Location
    Illinois
    Posts
    237
    Well, I know I am a rock sometimes and it takes me a couple of iterations before I say huh? It seems to me, unless the big BAD SOB admin sends me a nasty email stating I've been a BAD boy and to cease and desist immediately, I won't, and I will keep on going like the pink-eared bunny. You're welcome for my contribution to the project; it is my pleasure. Off I go back to my hole.


    e:
    Last edited by engracio; 11-03-2005 at 07:06 AM.

  20. #20
    Originally posted by Joh14vers6
    Alien 88,

    Could you give the breakdown of dubious tests by user id, ordered by percentage?
    Any word on this, or did I totally miss it? If I did, please scold me and point me to the right place.

    I don't think I was listed, but I was curious if I could search my User ID for bogus tests because, by my estimation, and depending on the time-frame in question, I might make up a significant component of the 374 tests (of which 112 were bad) from 24.x.x.x, which is a problem IP.

    Those only appear to be top-20-or-so lists noted above. I would like to know if I have submitted 100 bogus tests, for instance (just outside the top 20 by userID), and from which machine(s).
    I have 3-4 machines running SOB off and on at two different IPs, and I might be able to tell by where and when bogus tests are submitted whether or not I have a suspect computer. I could have one computer submitting nothing but junk, which I would obviously want to rectify.

    Is there a complete general list I can browse, or a way to check my submissions?

    Thanks!

  21. #21
    Originally posted by kelman66
    Any word on this, or did I totally miss it? If I did, please scold me and point me to the right place.

    I don't think I was listed, but I was curious if I could search my User ID for bogus tests because, by my estimation, and depending on the time-frame in question, I might make up a significant component of the 374 tests (of which 112 were bad) from 24.x.x.x, which is a problem IP.

    Those only appear to be top-20-or-so lists noted above. I would like to know if I have submitted 100 bogus tests, for instance (just outside the top 20 by userID), and from which machine(s).
    I have 3-4 machines running SOB off and on at two different IPs, and I might be able to tell by where and when bogus tests are submitted whether or not I have a suspect computer. I could have one computer submitting nothing but junk, which I would obviously want to rectify.

    Is there a complete general list I can browse, or a way to check my submissions?

    Thanks!
    No no no no no!!

    dubious doesn't mean 'bogus' or 'bad' or 'error rate'. Please see the definitions above!

    I may be able to show it via user ID sometime this weekend, but no guarantees..

  22. #22
    Basically what you're saying is that if I do a test and return residue A, and then someone else does the test again via secondpass, has an error, and produces residue B, then BOTH tests will be marked as dubious.

    Therefore those people who have done more tests are more likely to have more dubious tests, as their tests are more likely to be paired with other people's bad tests; i.e., they could have a lot of dubious tests even if all of their tests were perfect.



  23. #23
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Originally posted by Alien88
    I guess I have to say it again.. dubious and ERROR are *NOT* the same..
    Yes I understand...

    I'm not concerned about this:
    Code:
       5202                       1268       391       30.84%
    Since I've been running secondpass exclusively for a couple of years.

    What I am concerned about is this:
    Code:
    129.x.x.x                  441       395       89.57%
    With a dubious rate approaching 90%, you can see my concern. Like E, I'll continue my contributions, at least until I get a nasty e-mail saying WHOA DUDE!!!

  24. #24
    Originally posted by vjs
    Yes I understand...

    What I am concerned about is this:
    Code:
    129.x.x.x                  441       395       89.57%
    With a dubious rate approaching 90%, you can see my concern. Like E, I'll continue my contributions, at least until I get a nasty e-mail saying WHOA DUDE!!!
    It still says nothing and could be a coincidence.

  25. #25
    Senior Member
    Join Date
    Jun 2005
    Location
    London, UK
    Posts
    271
    Even if a machine is producing BOGUS tests 90% of the time, it is still producing the correct result 10% of the time.

    If the machine produces an incorrect test then it has to be tested somewhere else until we get a matching residue, no loss there.

    So there is still a gain to be had from machines producing bogus tests doing second-pass. No matching result? Someone else should pick it up and do it.

    Obviously if a machine is producing lots of errors then I'd like to know if it was one of mine as I'd like to get it looked at and get broken CPU/memory replaced.

    Machines producing definitely BOGUS first-pass tests could be a problem, but then that is why we do second-pass double checking. All machines are susceptible to random errors, but it may be better if known dodgy machines are churning out first-pass rubbish.

    All of my machines, including the P4 I'm sceptical of (hence my 21% DUBIOUS results) are doing second-pass so there should be no negative effect.

    I am assuming two things:

    1) A machine doesn't get assigned a test it had previously been assigned, although this isn't a problem if:
    2) It doesn't produce the same incorrect residue if it does.

  26. #26
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Thank you, Greenbank, you're obviously understanding what I'm saying.

    The other possibility is that I've been doing secondpass for a very, very long time. Perhaps the results don't match because the original residue is non-existent, or is a different residue version(?) from client 1.x.x etc.

    I'm only worried it's a borged box running the fubared client. I'm not terribly concerned, since they are secondpass etc.

    Your point of running bogus boxes on secondpass is the way to go, yup. The possibility of matching bogus results is near zero and not worth considering project-wise. I think this applies to E and Wilden as well, who are also in the secondpass boat.

  27. #27
    Any news on the error rates, now that almost everything up to 5M is finished?

  28. #28
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    What does everyone think about the need for availability of a PRP results file?

    I guess there would be some people willing to do some crunching and come up with useful ideas.

    What info should it contain? The more the better, of course. I feel like it should contain

    k
    n
    client version
    residue
    user id
    IP address
    date assigned
    date returned

    data at the minimum.


    Probably monthly updates to the file (say, on the first day of each month) would be enough to show us whether anything is going wrong, and it would give the project coordinators enough lead time in case of a new prime.


    So, what do you think? Any comments?

  29. #29
    Originally posted by Nuri


    k
    n
    client version
    residue
    user id
    IP address
    date assigned
    date returned

    I think with all this we get a data privacy conflict. The publication of IP addresses in particular I would consider an offense.
    Anonymisation should absolutely be done.

    As for the publication of the residues, I see no sense in it, as they are sort of pseudo-random variables, and making them public would just give misguided folks the opportunity to submit residues without testing.

    I think if ever there is somebody interested, for writing a thesis about Distributed Computing or so, he can ask for the data and will get it, through personal interaction with the admins (so they know his name etc.).

    As for our interest in crunching the data to find bad machines, making statistics etc.,
    k
    n
    client version
    the last two digits of the residue
    user id (anonymized)
    possibly IP address (anonymized)
    date assigned
    date returned

    should be sufficient, if necessary. (!?!)
    Though I doubt there is enough interest to make it worth the work. But...
    Yours, H.

  30. #30
    hhh is right on with his comments!
    Making public records of IP addresses and usernames linked to specific tests is a definite no-no. Many people, myself included, feel very strongly about privacy issues.
    A negative too for making residues public: it would make it dead easy for anyone wishing to wreak havoc on the project to forge results.

    Some extra comments from me:
    The IP address is mostly irrelevant if your ISP uses dynamic IP. For example, I have over 50 IPs on record at the site; they all correspond to only 3 PCs, and there's no way except looking at the logs of each machine to find out which one submitted a specific residue. All of them are connected using the same ISP and an ever-changing IP (within a range, but still...). It might of course be more useful for those of you using static IPs, but definitely not for extracting generic project-wide data. I have found that my DSL even occasionally (on average every 1-2 days of continuous online presence) changes IP automatically while operating, so I can very well begin a test with one IP and finish it with a quite different one, making useful data extraction and linking of tests to specific machines difficult or impossible.

    And perhaps making only the last 2 digits of the residue public is not accurate enough either; it's not too unlikely to match the last 2 digits by chance. Perhaps a good idea would be to publish more digits (say, the last 4-5), or present the residue in the form D**2*C1***8*A**5, i.e. more than half of the digits replaced by an asterisk, whichever sounds more reasonable or easier to implement.
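
    That masked form is easy to produce; a quick sketch (the sample residue and the choice of visible positions are made up):
    Code:
    # Keep fewer than half of a 16-hex-digit residue's characters visible.
    def mask_residue(residue, visible=(0, 3, 5, 6, 10, 12, 15)):
        return ''.join(c if i in visible else '*' for i, c in enumerate(residue))

    print(mask_residue('D4527C1AB38FAA95'))  # -> D**2*C1***8*A**5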

    Finally, even a perfectly legitimate use, like writing a thesis, would have to be considered very carefully by the admins before they share any personal data like the IPs. No such agreement is made when you join the site, so anyone affected would have a strong case against the team; not worth the risk IMHO, unless consent is given first by the users.

    Perhaps everyone should be able to access his personal data only, i.e. all the info Nuri posted about, but only when logged in through the personal preferences page.

    I hope all these make sense

  31. #31
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Agreed almost all around...

    I don't think userID is private information; you can easily match a person's userID with their account name and e-mail address (if they made it public) using the server currently.

    On IP addresses... absolutely... never the whole address. However, I think the last 3 digits would work just fine. (userID and the last 3 digits don't mean much.)



    Residues: never, never the full residue.

    On residues you only need the last 3 characters; the chance of two different residues agreeing in the last 3 by accident is probably small enough for us. A thesis might be a different story.

  32. #32
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    Fair enough, thx for the comments.

  33. #33
    I love 67607
    Join Date
    Dec 2002
    Location
    Istanbul
    Posts
    752
    It seems like we have at least two residues for more than 99% of tests below 5M and for more than 80% of the tests between 5M and 6M, as of now.

    Alien88, would you consider an update on error rates?

  34. #34
    Member
    Join Date
    Dec 2002
    Location
    Eugene, Oregon
    Posts
    79
    It seems to me that the error rate pretty much determines how far second pass should lag behind first pass. Suppose the error rate is e, and current second pass exponents are a fraction f of first pass exponents; so, for example, if f=.6, we would be doing second pass at 6M simultaneously with doing first pass at 10M. The amount of work to test all exponents up to n is approximately proportional to n^3. The expected amount of work to find a prime at exponent n would then be:

    (1-e)(n^3 + (fn)^3) + e((n/f)^3 + n^3) = n^3(1+ (1-e)f^3 + e/f^3).

    Minimizing this function with respect to f (setting the derivative to zero gives 3(1-e)f^2 = 3e/f^4, i.e. f^6 = e/(1-e)) then leads to f = (e/(1-e))^(1/6).

    If e = .08, this gives an optimal f of about .665, so one would expect second pass to be at exponents about 2/3 the size of first pass exponents. For e=.015, f should be closer to .5, and this is about the level of the GIMPS double-checking effort. By the time e reaches .25, f should be about .83, and for e equal to .50 (a 50% error rate), the optimal f rises to 1.
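
    A quick numerical sanity check of this closed form (a sketch, nobody's production code): grid-search the bracketed factor 1 + (1-e)f^3 + e/f^3 over f and compare with (e/(1-e))^(1/6).
    Code:
    def optimal_f(e):
        return (e / (1.0 - e)) ** (1.0 / 6.0)

    def work_factor(e, f):
        return 1.0 + (1.0 - e) * f**3 + e / f**3

    for e in (0.015, 0.08, 0.25, 0.50):
        # brute-force minimum over f = 0.001 .. 1.000
        f_best = min(range(1, 1001), key=lambda k: work_factor(e, k / 1000.0)) / 1000.0
        print('e=%.3f  closed form f=%.3f  grid minimum f=%.3f'
              % (e, optimal_f(e), f_best))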

  35. #35
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Looks good, Phil. The only difference between Mersenne and SoB, however, is that we don't really need to test every n. In other words, we just have to find a prime k/n pair.

    Not discounting the accuracy of the above equation, but there are a few minor things that also need to be considered.

    Since we stop once we find a prime for a particular k, this leads to lost work from doing secondpass testing when a firstpass test yields a prime, and vice versa. Not sure how this affects your equation.

    Second, lower-n tests are more likely to be prime than firstpass tests due to prime density. This could easily be factored into your equation.

    Also, we have a fairly substantial sieve and factoring effort which needs to be considered.

    I'm not discounting your equation; I'm sure the above considerations affect f by only a few percent. I may graph your function of e vs f later this week and post it here; nice work and thanks. I just thought you might wish to optimize it further with prime density considerations?

  36. #36
    Member
    Join Date
    Dec 2002
    Location
    Eugene, Oregon
    Posts
    79
    Thanks, your suggestion of factoring in prime densities is an excellent one. As for the "wasted work" issue, that is what I was trying to model with the equation. I made the assumption that there was a prime-yielding exponent at the value n and tried to model the amount of work it would take to find it as proportional to:

    (1-e)(n^3 + (fn)^3) + e((n/f)^3 + n^3)

    (1-e) is the probability that the exponent is found in first pass, e is the probability that it is found in second pass. If the exponent is found in first pass, (fn)^3 represents the amount of work "wasted" in doing second pass tests, while if the exponent is found in second pass, (n/f)^3 represents the amount of work "wasted" in doing first pass tests. (Of course, I am leaving out a proportionality constant in all this.) But certainly the possibility that there is a second prime yielding exponent just a little bit higher is one I haven't taken into account.

    One thing I like about this model is that it allows changing f in response to changes in e. For example, e could be effectively lowered by identifying tests with errors, or tests from machines which have proven less reliable in the past, and reissuing those tests in the first-time queue. A lower e would then lead to a lower f, which would then result in more emphasis on first pass. If f = 1/2, the work done in double-checking only represents 1/8 of the work done doing first-time tests, so double-checking represents about 11% of the total effort. If f = 2/3, then the work done in double-checking represents 8/27 of the first-time work, or 8/35 of the total, about 23% of the total effort.
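
    The arithmetic of that last paragraph as a two-line check (second pass costs f^3 of the first-pass work, so it is f^3/(1+f^3) of the total):
    Code:
    for f in (0.5, 2.0 / 3.0):
        share = f**3 / (1.0 + f**3)
        print('f=%.3f -> double-checking is %.1f%% of total effort' % (f, 100.0 * share))
    # prints about 11.1% for f=1/2 and 22.9% for f=2/3, matching the text above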

    Thanks for your thoughtful comments!

  37. #37
    Moderator vjs's Avatar
    Join Date
    Apr 2004
    Location
    ARS DC forum
    Posts
    1,331
    Phil,

    Thank you for the more detailed explanation; your equation is making a lot more sense to me now. Personally, I wouldn't worry about the possibility of two primes, i.e. a firstpass missing the first occurrence followed by finding a second occurrence before secondpass reveals the first. (Phew... hope that makes sense.)

    I think the likelihood of something like this would be low as long as secondpass doesn't fall too far behind.

    Reflections on factoring:
    I don't think factoring (P-1) makes any difference to your equation, since P-1 is generally done before firstpass testing. Also, P-1 factoring shouldn't be done below the firstpass level or "submitted late" anyway.

    Sieve, on the other hand, does make a difference, since there is a window where sieving can eliminate a test after it has been firstpassed.

    In any case, for sieve, prime density, etc., I believe the contribution is smaller than the uncertainty in the error rate determination itself.

    Graphing your function gives an interesting graph: y is f (secondpass n divided by firstpass n), x is the error rate e (0.01 is a 1% error rate).

    I think this graph aids your explanation from before, and your points.

    - A secondpass n half the size of firstpass n is only good for about a 1.5% error rate.
    - A more reasonable (possibly over-)estimate of the error rate, 10%, would require a secondpass level of 70% (which is about the current second vs. first pass level).

    Well done, Phil; this gives us a lot of food for thought.

    By the time we reach n=12M for firstpass, secondpass should be at 8M, but no more than 8.5M.
    Attached thumbnail: fvse.gif (graph of f vs. e)
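
    For anyone who wants to redraw it, a short sketch (assumes numpy and matplotlib):
    Code:
    import numpy as np
    import matplotlib.pyplot as plt

    # f = (e/(1-e))^(1/6) over error rates from 0.5% to 50%
    e = np.linspace(0.005, 0.5, 200)
    f = (e / (1 - e)) ** (1.0 / 6.0)
    plt.plot(e, f)
    plt.xlabel('error rate e')
    plt.ylabel('f (secondpass n / firstpass n)')
    plt.grid(True)
    plt.savefig('fvse.png')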

  38. #38
    Old Timer jasong's Avatar
    Join Date
    Oct 2004
    Location
    Arkansas(US)
    Posts
    1,778
    For those of you interested in calculating the value of a test in relation to other things (like sieve, P-1 and secondpass), I had a thread on Riesel Sieve. Unfortunately, the sub-forum it's in seems to have disappeared (another sub-forum did this a while back; it was an accident). If the sub-forum comes back I'll put a link here.

    Word of advice: when I post the thread, readers would probably get more out of the experience by reading the posts in reverse order. I made some screwups in the beginning, but I believe I got it right in the end.

    edit: Here it is.
    Last edited by jasong; 02-27-2006 at 06:44 PM.

  39. #39
    I believe you are making it unnecessarily complex. All you need to do is determine which is most likely to produce a prime at any given level. For second pass this is exactly (error rate) x (likelihood of a prime) x (ratio of the time a full test takes compared to firstpass). Make this value equal to the firstpass likelihood of a prime and you have the correct level. Of course these are non-static values which I am incapable of determining. This does seem to me the most appropriate way of starting, however. Test density as affected by P-1 and sieving affects the likelihood of a test returning a prime for all tests in the sieved range, therefore it is unnecessary to include in the overall equation, only in a small part of it, which has been done before in other threads. I believe all of these values have been determined before and should be relatively easy to find in the threads.

  40. #40
    Member
    Join Date
    Dec 2002
    Location
    Eugene, Oregon
    Posts
    79
    I tried to incorporate the suggestions of vjs and am trying to understand why I am now getting a different answer. Not that I wouldn't expect a different answer, but I am getting a smaller f instead of a larger one, whereas I would have thought that taking into account the fact that a small-n test is more likely to lead to a prime would give a larger f.

    The assumptions of the first model were that there was a prime k*2^n+1 at a certain n, and I was trying to minimize the expected amount of time it would take for this prime to show up either at first or second pass.

    Now I am just assuming that a test of k*2^n+1 takes time proportional to n^2*ln n, and that the probability of this number being prime is proportional to 1/ln(k*2^n+1), or 1/n, with a proportionality constant which depends upon how far factors have been sieved for the candidate. Therefore, I am assuming in this model that first and second pass candidates have been sieved to the same general overall level.

    Therefore, the probability per unit time of finding this candidate k*2^n+1 prime is proportional to 1/(n^3*ln n). Overall, the ln n factor makes little difference in finding the optimal strategy.

    So suppose e is the proportion of first pass tests which are in error, and suppose that second pass lags behind first pass by a factor of f < 1. The probability per unit time that a first pass test finds a prime is then approximately proportional to (1-e)/n^3, where n is the current range of testing. If second pass tests are taking place on exponents of size f*n, the probability per unit time of a second pass test finding a prime is then proportional to e/(fn)^3. Equating the two (and ignoring the ln n factor), we see that at the optimal f, where first and second pass tests are equally likely per unit time to find a prime, we must have:

    f = (e/(1-e))^(1/3)

    Note the exponent of 1/3 instead of 1/6! This analysis says that the optimal f is the square of the f in my original model, and therefore smaller; i.e., for e=.08, f is .45 instead of .66, and for e=.015, f is .25 instead of .5, leading to much lower second pass ranges. The paradox is that we are now taking into account that smaller n's are more likely to yield primes, which one would think would increase the importance of second-pass tests, rather than decrease it. On the other hand, because we would now be happy with a larger-n prime if the smallest n happened to have been missed in first pass, that seems to argue for emphasizing first pass more.

    Am I missing anything here? I'd appreciate it if anyone notices something I have overlooked. Thanks again for your comments.

    Two more points: I don't think P-1 changes anything, as P-1 is just a cost-effective way of eliminating some tests. In addition, these primes are rare, so in most cases we would expect that if a first pass prime were missed, nothing would be discovered until second pass catches up to it. So why do the two models make such different predictions about how far behind first pass the optimal second pass tests should be?
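
    The two closed forms side by side, using the formulas exactly as given above (a sketch):
    Code:
    def f_min_work(e):    # first model: minimize expected total work
        return (e / (1.0 - e)) ** (1.0 / 6.0)

    def f_equal_rate(e):  # second model: equal prime probability per unit time
        return (e / (1.0 - e)) ** (1.0 / 3.0)

    for e in (0.015, 0.08, 0.25):
        print('e=%.3f  work-minimizing f=%.2f  rate-equalizing f=%.2f'
              % (e, f_min_work(e), f_equal_rate(e)))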

