Server down?

**jMcCranie** · 05-19-2016, 09:54 PM

Originally Posted by AG5BPilot

We don't know which ones were double checked

Unlike the Mersenne prime search, we only need to double check positive results. The time double-checking is better spent checking new numbers. For Mersenne primes, we want a complete list. For 17-or-bust, we only need to find a prime for each coefficient. If we get a false negative, no harm is done if we find a prime for that coefficient.

**endless mike** · 05-20-2016, 04:03 AM

Originally Posted by jMcCranie

Unlike the Mersenne prime search, we only need to double check positive results. The time double-checking is better spent checking new numbers. For Mersenne primes, we want a complete list. For 17-or-bust, we only need to find a prime for each coefficient. If we get a false negative, no harm is done if we find a prime for that coefficient.

A false negative would mean a missed prime. Potentially wasted years of computing would count as harm in my book. That's the main reason I gave up on SOB and went back to GIMPS.

Originally Posted by AG5BPilot

Consider a hypothetical k where the first prime is at n=100,000, and the second prime is at n=100,000,000. If you miss the first prime because of an undetected computation error, many years of unnecessary computing will be wasted searching for the second prime.

**jMcCranie** · 05-20-2016, 10:30 PM

Originally Posted by endless mike

A false negative would mean a missed prime. Potentially wasted years of computing would count as harm in my book. That's the main reason I gave up on SOB and went back to GIMPS.

Yes, but the purpose of SOB is to try to prove the Sierpinski conjecture. Large primes are very rare and a false positive is extremely rare. Double-checking in SOB cuts the throughput down by half. In other words, double-checking essentially doubles the expected computing that has to be done to prove the conjecture.

**endless mike** · 05-21-2016, 12:22 AM

Originally Posted by jMcCranie

Yes, but the purpose of SOB is to try to prove the Sierpinski conjecture. Large primes are very rare and a false positive is extremely rare. Double-checking in SOB cuts the throughput down by half. In other words, double-checking essentially doubles the expected computing that has to be done to prove the conjecture.

Consider the highly unlikely possibility that there's only one prime for a given k. A false negative with no double checking means we crunch that k forever and never prove the conjecture. Unlikely to happen that way, but not impossible. People more in the know claim an error rate of about 4% (IIRC) on GIMPS. On the PrimeGrid message board, someone mentioned that a SOB work unit had to be sent out an average of 4.7 times to get a matching double check. That post is three years old, but I can't imagine the situation is much different now. I still think double checking is valuable.

**AG5BPilot** · 05-21-2016, 08:06 AM

Originally Posted by endless mike

On the PrimeGrid message board, someone mentioned that a SOB work unit had to be sent out an average of 4.7 times to get a matching double check. That post is three years old, but I can't imagine the situation is much different now. I still think double checking is valuable.

While I'm solidly in the "double checking is a necessity" camp (and I'm one of the people making the decisions), let me correct that "4.7" statistic. While it's true that some of our sub-projects require a lot of tasks to be sent out in order to get two matching results, it's not because the results don't match. It's because most of the tasks either don't get returned at all (or at least not by the deadline), or have some sort of error that prevents the result from completing. Bad residues are more common than I'd like, but they're not THAT common. Here's some hard data on our SoB tasks currently in the database:

SoB:
Completed workunits: 1089
Average number of tasks per WU (2 matching tasks are required, and 2 are sent out initially): 3.7805 tasks per workunit (4117 tasks total)
Number of tasks successfully returned but eventually proven to be incorrect: 61

As you can see, about 6% of the workunits had tasks that looked like they returned a correct result, but in fact didn't. These are SoB tasks -- the same as you run here. We use LLR, but it uses the same gwnum library as you do here, so the error rates are going to be comparable. LLR has lots of internal consistency checks, so many computation errors are caught and not even returned to us. The 61 above are just the ones that slipped through all the checks and made it to the end.

At PrimeGrid we detect the errors, so the user gets an immediate indication that something's wrong. On projects that don't double check, the users never know there's a problem, so the error rate might be higher.

The numbers are worse on GPU calculations. It's much harder to get GPUs to work at all, resulting in many tasks which fail immediately. On our GFN (n=22) tasks, which are GIMPS-sized numbers:

GFN-22:
Completed WUs: 2217
Tasks: 17996 (about 8 tasks per WU)
Completed but incorrect tasks: 85 (about 4%)

Some of those tasks are CPU tasks, but the vast majority are GPU tasks.

So there's your hard data: On the long tasks (SoB on CPU, GFN-22 on GPU), about 6% of workunits had seemingly correct results from CPUs which turned out to be wrong, and about 4% of the workunits had GPU tasks which were wrong.
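As a quick check of the arithmetic, here is a minimal Python sketch that recomputes the ratios from the figures quoted above. The dictionary layout and the `rates` helper are illustrative names only, not anything from PrimeGrid's database. The per-task rate is derived here rather than quoted, and since the task counts include tasks that were never returned, it understates the error rate among returned results. The per-workunit figure equals the share of affected workunits only if no workunit had more than one bad task.

```python
# Figures copied from the post above; names and structure are illustrative.
stats = {
    "SoB": {"workunits": 1089, "tasks": 4117, "bad_tasks": 61},
    "GFN-22": {"workunits": 2217, "tasks": 17996, "bad_tasks": 85},
}


def rates(workunits, tasks, bad_tasks):
    """Return the three simple ratios discussed in the post."""
    return {
        "tasks_per_wu": tasks / workunits,     # tasks sent per completed workunit
        "bad_per_wu": bad_tasks / workunits,   # bad-but-plausible results per workunit
        "bad_per_task": bad_tasks / tasks,     # same, per task sent (derived, not quoted)
    }


summary = {name: rates(**figures) for name, figures in stats.items()}

for name, r in summary.items():
    print(
        f"{name}: {r['tasks_per_wu']:.2f} tasks/WU, "
        f"{r['bad_per_wu']:.1%} bad per WU, "
        f"{r['bad_per_task']:.2%} bad per task sent"
    )
```

Run as is, this reproduces the "about 6%" and "about 4%" per-workunit figures and the 3.7805 tasks per workunit for SoB.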

(Frankly I'm surprised that the CPU error rate is higher than the GPU error rate.)

**engracio** · 05-21-2016, 05:18 PM

A while back we ran a double check up to 30M. After most WUs were returned, 22699K only had 1 WU that still needed to be submitted. The higher k values had fewer than 10 WUs each still to be returned. Unfortunately, I am not sure whether the results were matched against the previous results. Mike, any idea?

**AG5BPilot** · 05-21-2016, 05:37 PM

Originally Posted by engracio

A while back we ran a double check up to 30M. After most WUs were returned, 22699K only had 1 WU that still needed to be submitted. The higher k values had fewer than 10 WUs each still to be returned. Unfortunately, I am not sure whether the results were matched against the previous results. Mike, any idea?

None at all.

**jMcCranie** · 05-21-2016, 10:33 PM

Originally Posted by AG5BPilot

SoB:
Completed workunits: 1089

What is a "workunit"?

**AG5BPilot** · 05-22-2016, 12:50 AM

Originally Posted by jMcCranie

What is a "workunit"?

For the purposes of this discussion, "a candidate", i.e., a number to be tested, is a reasonable definition.

**jMcCranie** · 05-21-2016, 10:32 PM

Originally Posted by endless mike

Consider the highly unlikely possibility that there's only one prime for a given k. A false negative with no double checking means we crunch that k forever and never prove the conjecture. Unlikely to happen that way, but not impossible...

OK, so how about: no double checking to try to resolve the conjecture nearly twice as quickly, but if and when it gets down to only one k with unknown status, run a double check on those.

----Added----

I'll make an analogy. Suppose that there are a large number of boxes. A small number of boxes contain a diamond and you want to find diamonds. The first time you look in a specific box, if it contains a diamond, there is a 5% chance that you will not see it.

Should you (1) spend half of your time double-checking boxes you have already opened, or (2) open as many boxes as you can? I would open as many boxes as I can.
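The analogy can be put into numbers with a small Python sketch. It assumes every box is equally likely to hold a diamond and equally hard to open, that a second look misses independently with the same 5% chance, and that option (1) looks twice at every box it opens. The budget of 1000 looks and the 1% diamond rate are made-up values for illustration; only the 5% miss chance comes from the analogy.

```python
def expected_found(look_budget, p_diamond, p_miss, looks_per_box):
    """Expected diamonds found when each opened box is looked at `looks_per_box` times.

    Assumes identical boxes and independent misses on each look.
    """
    boxes_opened = look_budget / looks_per_box
    p_seen = 1 - p_miss ** looks_per_box  # diamond is seen on at least one look
    return boxes_opened * p_diamond * p_seen


LOOK_BUDGET = 1000   # hypothetical total number of looks we can afford
P_DIAMOND = 0.01     # hypothetical chance that a box holds a diamond
P_MISS = 0.05        # from the analogy: chance of overlooking a diamond

single = expected_found(LOOK_BUDGET, P_DIAMOND, P_MISS, looks_per_box=1)
double = expected_found(LOOK_BUDGET, P_DIAMOND, P_MISS, looks_per_box=2)

print(f"open as many boxes as possible: {single:.4f} expected diamonds")
print(f"double-check every box:         {double:.4f} expected diamonds")
```

Under these assumptions, opening more boxes finds nearly twice as many diamonds. The result depends entirely on the boxes being identical in both diamond probability and opening cost.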

**Joe O** · 05-23-2016, 08:37 AM

Originally Posted by jMcCranie

OK, so how about: no double checking to try to resolve the conjecture nearly twice as quickly, but if and when it gets down to only one k with unknown status, run a double check on those.

----Added----

I'll make an analogy. Suppose that there are a large number of boxes. A small number of boxes contain a diamond and you want to find diamonds. The first time you look in a specific box, if it contains a diamond, there is a 5% chance that you will not see it.

Should you (1) spend half of your time double-checking boxes you have already opened, or (2) open as many boxes as you can? I would open as many boxes as I can.

It is important to note that the boxes are numbered, and
1) The lower numbered boxes are more likely to contain a diamond than the higher numbered boxes.
2) The higher numbered boxes are harder to open than the lower numbered boxes.

**AG5BPilot** · 05-23-2016, 11:14 AM

Originally Posted by Joe O

It is important to note that the boxes are numbered, and
1) The lower numbered boxes are more likely to contain a diamond than the higher numbered boxes.
2) The higher numbered boxes are harder to open than the lower numbered boxes.

Taking that a bit further, the difficulty of opening a box is proportional to the square of the box number, and the overall chance of finding a diamond (taking into account how hard it is to open the box as well as the likelihood of a given box containing a diamond) is approximately inversely proportional to the cube of the box number times the logarithm of the box number. Diamonds in higher numbered boxes are much harder to find. You really don't want to miss the easy ones, ever.

The allure of progressing twice as fast is obvious, but the penalty for missing a prime is tremendous.
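To get a feel for the size of that penalty, here is a rough Python sketch of the scaling described above, applied to the earlier hypothetical of a first prime at n=100,000 and a second at n=100,000,000. The function names are illustrative, the constants of proportionality are dropped, and only ratios are meaningful. This is a back-of-the-envelope model, not a measured cost.

```python
import math


def relative_cost(n):
    """Relative effort to test one candidate: proportional to n squared."""
    return n ** 2


def relative_yield(n):
    """Relative chance of a find per unit of effort: ~ 1 / (n^3 * log n)."""
    return 1.0 / (n ** 3 * math.log(n))


easy_n = 100_000        # first prime in the hypothetical
hard_n = 100_000_000    # second prime in the hypothetical

cost_ratio = relative_cost(hard_n) / relative_cost(easy_n)
yield_ratio = relative_yield(easy_n) / relative_yield(hard_n)

print(f"one test at n={hard_n:,} costs about {cost_ratio:.3g}x a test at n={easy_n:,}")
print(f"effort spent at n={easy_n:,} is about {yield_ratio:.3g}x more productive")
```

Under this model a single test at the larger n costs about a million times more, and effort spent near the smaller n is more than a billion times more likely to pay off.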

**jMcCranie** · 06-01-2016, 08:17 PM

Is there any word on the SeventeenOrBust project?

**AG5BPilot** · 06-01-2016, 09:53 PM

Originally Posted by jMcCranie

Is there any word on the SeventeenOrBust project?

Nothing new to report.

Rest assured that someone (probably me, unless Louie jumps in) will let you know any information as soon as we know anything.

If I were Louie, I wouldn't give up until every last possibility was tried. And that may take a while.
