V1.28 GUI
Max of 75Kp/sec on a very idle P3-600 (67Kp/sec when I'm using it normally)
Maybe you have a lot of sent items. Try removing some of those.
Originally posted by paul.jobling
I think my PM inbox is broken - it is empty but people say they cannot send me messages as it is full...
PIV 1.5 goes from ~55kp/s (1.27) to ~60kp/s. Very nice.
V1.28 GUI
Max of 75Kp/sec on a very idle P3-600 (67Kp/sec when I'm using it normally)
Version 1.28 is almost at the pace of the (faulty) 1.26 for me.
Still a slight cosmetic error: the help page still states Jan 20th.
If half a year ago I had told someone that I could achieve >100Kp/sec with a CPU <1 GHz for all 12 k's, I'd have been put in a nuthouse.
PIII 1 ghz from 104 to 116 with v 1.28...very nice indeed
Slatz
I think it is better to use the Pentium 4 for PRP testing and the Pentium III for sieving.
When sieving, my Tualatin@1420 MHz is almost 3 times faster than ceselb's Pentium 4 1500, and I'm sure that in PRP testing the Pentium 4 is much faster than the Pentium III.
That's true in the sense that P4 machines should be switched from sieving to prp testing earlier than others. But even at 20T, the P4 still performs much better at sieving wrt. prp testing.
Originally posted by Troodon
I think it is better to use the Pentium 4 for PRP testing and the Pentium III for sieving.
When sieving, my Tualatin@1420 MHz is almost 3X times faster than ceselb's Pentium 4 1500, and I'm sure in PRP testing the Pentium 4 is much faster than the Pentium III.
I think the decision for each machine should be based on its own sieve vs. prp performance, not on comparisons with other machines.
Just a quick stat.
My P4-1700 completes an n=~3m test in 2 days (0.5 tests per day).
It finds 7 factors per day when sieving at 19,000G.
So it removes roughly 14 times more candidates at sieving wrt. prp testing. (This figure was roughly 11 times a week ago, just before SoBSieve versions 1.27 and 1.28.)
It seems to me that P4 boxes will still perform better at sieving wrt. prp testing even at 100T (maybe not 14 times anymore, but my guess is it will be at least 3-4 times).
I'd be happy to hear any other P4 user if you have different benchmark ratios.
PS: But this does not change the fact that it's still psychologically depressing to run my P4-1700 at home at a sieve speed ~40% slower than my PIII-1000 at work.
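The arithmetic behind that 14x figure is simple enough to sketch (the rates are the ones reported above; the assumption that one PRP test retires exactly one candidate is mine):

```python
# Back-of-envelope from the figures above.
# Assumption: each PRP test retires exactly one candidate.
prp_tests_per_day = 0.5   # one n~3M test every 2 days on the P4-1700
factors_per_day = 7.0     # sieving at ~19,000G
ratio = factors_per_day / prp_tests_per_day
print(ratio)  # 14.0
```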
Version 1.28 GUI 90999 p/sec, a 6.64% increase on a PIII/500 under Win98SE - low P range on the Double Check Sieve Project.
Originally posted by Joe O
Version 1.25 GUI 80633 p/sec
Version 1.27 GUI 85333 p/sec 5.83 % increase on a PIII/500 under Win98SE.
Last edited by Joe O; 04-08-2003 at 09:42 PM.
Joe O
i recommend that everyone with a pentium4 run the SB client as opposed to sieving.
even though on a number vs number ratio, sieving appears better, it is far more beneficial to the project to find a new prime and eliminate all 50,000 possible remaining values for a given k.
pentium4's are incredibly efficient at doing prp tests. leave them on SB. in fact, i'd leave any 1GHz+ system on SB if at all possible. sieving is great, but in the end, SB has to find the primes.
-Louie
Thanks for the comment Louie. You have a good point there, which made me change my mind.
I'll switch my P4 to prp testing as soon as it finishes its pending sieve ranges at DC sieve.
Code:
Version       1<n<3M   3M<n<20M
V125 Console  100000    40624
V127 Console   91083    45944
V127 GUI       93080    48744
V128 Console  107091    49985
V128 GUI      116803    54713
Windows NT on Celeron/450
P around 12800000000000
Looks like we have a new champion!
Joe O
What is the highest sieving speed someone achieves?
I present a Duron 1.3 GHz weighing in at 151Kp/sec using SoBSieve 1.28.
Anyone beat me?
as you requested
200kp/sec on a P3-Celeron 1500
Here's a quick calculation that suggests it might still be premature. Using what I think are the best estimates available, the expected number of primes from n=3 million to n=4 million is 0.82. According to the latest Sieve Statistics, there are 36,320 candidates left in that range, so the probability of any single test finding a prime is about 1 in 44,000. Also based on the Sieving Report, there are about 616,000 candidates left above n=3 million - finding a prime would remove one twelfth of these, about 51,000. One chance in 44,000 of taking out 51,000 candidates is 1.1 on average, so one PrP test will, on average, remove 2.1 candidates. If the same effort sieving removes 14 candidates, it's still more efficient to sieve.
Originally posted by jjjjL
i recommend that everyone with a pentium4 run the SB client as opposed to sieving. ... it is far more beneficial to the project to find a new prime and eliminate all 50,000 possible remaining values for a given k.
-Louie
Originally posted by Nuri
Just a quick stat. ... So, it removes roughly 14 times more candidates at sieving wrt. prp testing.
If I've misunderstood these numbers, please jump in with corrections.
I'll admit, though, that I'm not positive this is the right framework for answering the question. There's a pool of diverse resources to be allocated over time, not just at this instant. Some of those resources will be better suited for sieving always - should P4's be switched earlier to preserve sieving work for the slower machines?
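For what it's worth, the expected-value arithmetic above can be checked directly (the numbers are the estimates quoted in the post, not authoritative figures):

```python
# Figures quoted in the post above; treat them as illustrative estimates.
expected_primes_3M_4M = 0.82
candidates_3M_4M = 36320
candidates_above_3M = 616000

p_prime_per_test = expected_primes_3M_4M / candidates_3M_4M  # ~1 in 44,000
removed_if_prime = candidates_above_3M / 12                  # ~51,000
# A PRP test removes its own candidate, plus (probabilistically)
# one twelfth of the remaining pool if it happens to find a prime:
expected_removed = 1 + p_prime_per_test * removed_if_prime
print(round(expected_removed, 2))  # ~2.16, i.e. the "2.1" above
```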
Hi wblipp,
Originally posted by wblipp
If I've misunderstood these numbers, please jump in with corrections.
There's a pool of diverse resources to be allocated over time, not just at this instant. Some of those resources will be better suited for sieving always - should P4's be switched earlier to preserve sieving work for the slower machines?
First, a small correction:
The server is currently distributing ranges at 3,430,000. So there are roughly 36230*(1-0.43) = 20650 tests below n=4 million that have not yet been distributed for prp testing.
Add the number of currently pending tests (2645), and subtract the number of pending tests below n=3 million (44), and there are roughly 23250 candidates for 3m<n<4m whose results we don't yet know.
The sieving report does not take into consideration the number of tests that have already been prp tested. The "Remaining tests n > 3000000" stat at the SoB web site gives a better idea of the number of remaining candidates below n=20m that have not yet been distributed by the server. It's updated every 30 minutes, and includes both the data for prp tests and sieve submissions.
Regarding the issue of using my P4 for prp testing, I agree with you that it might still be considered for sieving. One reason I changed my mind is that I want to have a chance (even if it's very small, that's still better than none) of finding a prime too. My other three PIII's at work are already doing roughly 85% of my sieving work, and they will continue to do so for at least the next couple of months, provided I keep my job.
And, a last note: yes, P4's should be switched to prp testing earlier than other machines. Sieve speed on my PIII-1000's is roughly 60% faster than on my P4-1700 (126k/sec vs. 67k/sec), whereas the P4 performs much better at prp speed.
Last edited by Nuri; 04-11-2003 at 02:57 PM.
Joe, thanks for the benchmarks. As far as I see, they are in line with my figures.
Originally posted by Joe O
Looks like we have a new champion!
Code:
Version       1<n<3M   3M<n<20M
V125 Console  100000    40624
V127 Console   91083    45944
V127 GUI       93080    48744
V128 Console  107091    49985
V128 GUI      116803    54713
One pleasant surprise that came with the new algorithm (apart from the speed increase) is that the speed of the client is no longer proportional to the square root of the size of the range (up to version 1.25, it was something like: DC sieve speed = sieve speed * (20/3)^0.5; alternatively, we could put 2.7 instead of 3 for the ratio after the DC sieve SoB.dat changeover).
The proportion is smaller now with 1.27. That means the extra benefit that might be gained from changeovers is less than it would have been with the earlier algorithms.
Last edited by Nuri; 04-11-2003 at 06:51 PM.
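If I read Nuri's old rule of thumb correctly, the square-root scaling can be checked in one line (the 20/3 and 20/2.7 ratios are the ones given above; treat the whole thing as approximate):

```python
import math

# Pre-1.27 rule of thumb quoted above:
# DC sieve speed = sieve speed * (20/3)^0.5
old_factor = math.sqrt(20 / 3)     # ~2.58
# ...or with 2.7 for the ratio after the DC SoB.dat changeover:
alt_factor = math.sqrt(20 / 2.7)   # ~2.72
print(round(old_factor, 2), round(alt_factor, 2))
```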
So I'm not the only one who noticed that P4's are unexpectedly slow with SoBSieve.
Originally posted by Nuri
And, a last note, yes P4's should be switched to prp testing earlier than other machines. Sieve speed at my PIII-1000's are roughly 60% faster than my P4-1700 (126k/sec vs. 67 k/sec), whereas P4 performs much better at prp speed.
Paul, can you explain why this is so? One would expect a P4-1700 to be at least faster than a P3-1000.
Paul,
a small note on biquadratic residues.
If p = 3 (mod 4) then b is a biquadratic residue iff b is a QR.
Proof <=:
Take a primitive root g.
b being a QR gives b = g^(2m) for some m.
Case 1, m even:
c = g^(m/2)
c^4 = g^(2m) = b
Case 2, m odd:
Since p = 3 (mod 4), phi(p) = p-1 = 4k+2 for some k, so
g^(2m) = g^(2m+phi(p)) = g^(2m+4k+2)
4 | 2m+4k+2, because 2m = 2 (mod 4) when m is odd
c = g^((2m+phi(p))/4)
c^4 = b
Regards,
Mikael
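Mikael's claim is easy to sanity-check numerically for small primes (brute force, purely illustrative):

```python
# Empirical check: for primes p = 3 (mod 4), b is a fourth power
# mod p exactly when b is a quadratic residue mod p.
def is_qr(b, p):
    # Euler's criterion: b^((p-1)/2) = 1 (mod p) iff b is a QR
    return pow(b, (p - 1) // 2, p) == 1

def is_fourth_power(b, p):
    # brute force: does any c satisfy c^4 = b (mod p)?
    return any(pow(c, 4, p) == b % p for c in range(1, p))

for p in (7, 11, 19, 23, 31, 43):   # primes = 3 (mod 4)
    for b in range(1, p):
        assert is_qr(b, p) == is_fourth_power(b, p)
print("claim verified for the sampled primes")
```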
afaik the P4's [regular, not sse2] fpu is horribly slow. I'm guessing Paul is using it to do modular multiplication, which is where most of the work goes. Of course, that's just a guess and I'd be most pleased to get confirmation on that... I'm having trouble writing a really fast mod_mul myself.
Originally posted by smh
So i'm not the only one who noticed that P4's are unexpectedly slow with SOBSieve.
Paul, can you explain why this is so? one would expect that a P4-1700 is at least faster then a P3-1000
Regards,
Mikael
Thanks, Nuri
Originally posted by Nuri
First, a small correction:
Changing the open range also requires changing the expected number of primes, so I don't think that changes the results. My error in the number of remaining tests appears to be a few percent, so that doesn't make much difference either. Good luck on finding a prime!
Mike touched on an interesting point here. Why not combine the two sieving efforts (for those who like)?
Originally posted by MikeH
Reserving
14000 - 14020 MikeH
(and yes this is correct - same range as in 3M<n<20M)
According to my calculations, this would save almost 14% computing power.
All we need is a new SoB.dat file with both ranges, right?
The thing is that my latest idea (t'th power residues) depends on the n values being sparse. Including lots of n below 3M would very likely be bad for performance.
Originally posted by Mystwalker
Mike touched on an interesting point here. Why not combine the two sieving efforts (for those who like)?
According to my calculations, this would save almost 14% computing power.
All we need is a new SoB.dat file with both ranges, right?
Regards,
Mikael
For information, I see a 12% reduction in speed using a 300K<n<20M sob.dat in place of a 3M<n<20M sob.dat. 12% is close to the performance gain we've seen with SobSieve recently, so I (kind of) see it as DC sieving coming for free.
Originally posted by mklasson
The thing is that my latest idea (t'th power residues) depends on the n values being sparse. Including lots of n below 3M would very likely be bad for performance.
Looked at another way, it's only a 55% reduction when using a 300K<n<3M sob.dat, and main sieving comes for free!
I'll probably move in the direction of using this alternative sob.dat. Currently I still use NbeGone_010 for most of my normal sieving, because the last time I compared NbeGone with SieveConsole the difference didn't seem to warrant changing. I have done some tests today, and I now find that SieveConsole 1.28 is ~60% faster than NbeGone on one of the P3s I use for the bulk of my sieving. Maybe the difference wasn't so big on AMD XP+ (which I use for testing). Anyhow, when I switch over, rather than a 60% increase, I'll get 50% plus free DC sieving.
For anyone else that's interested, I'll upload the alternative sob.dat, and give a link to it in the DC thread in a short while.
Mike.
I agree with you Mike. As I mentioned above, with the latest algorithm change, it makes more sense to sieve larger ranges now. However, I have two concerns:
With the alternative sob.dat file, there will be three different sob.dats. So there is the danger of confusion, and more importantly, the problems that might arise from using a wrong sob.dat file are much worse than anything that could have happened if we had made a changeover (in the context I discussed before version 1.28). We should be very careful publicizing the alternative sob.dat file.
Secondly, the number of factors found drops significantly for n<3m at p~20T, to something like 1 factor per 5G. On the other hand, at the normal sieve project everything below 14T is already reserved. Also, I'm not really sure it would make any sense to sieve n<2m for 20T<p. So there is only a window of opportunity here.
Anyhow, if you will make the alternative sob.dat file available, I'm planning to reserve double ranges as you and Joe did, for p<20T.
That's an interesting way to look at it. A few ideas:
Originally posted by wblipp
Here's a quick calculation that suggests it might still be premature. Using what I think are the best estimates avaiilable, the expected number of primes from n=3 million to n=4 million is 0.82. According to the latest Sieve Statistics, there are 36,320 candidates left in that range, so the probability of any single test finding a prime is about 1 in 44,000. Also based on the Sieving Report, there are about 616,000 candidates left above n=3 million - finding a prime would remove one twelfth of these, about 51,000. One chance in 44,000 of taking out 51,000 candidates is only 1.1 on average. So one PrP test will on average, remove 2.1 candidates. If the same effort Sieving removes 14 candidates, it's still more efficient to sieve.
I don't know how correct it is to discuss the efficiency of sieving 3M-20M vs the testing from 3M-4M. This model sorta assumes away a lot of the work we're doing in the sieving. Another way to look at it is that every number we sieved that is higher than where we find a prime is "wasted" since we didn't need to sieve that high. So if your estimates assume we're going to find a prime before 4M, they should also assume that ~90% of the sieving for that k is worthless.
It will also be nice when the sieve speeds improve because we stop sieving a k. The speed of the sieve is still basically linear in the number of k's. And due to how sparse the possible factors are in such large ranges (>10T) for these numbers, we may expect a slightly larger than proportional increase in speed since the kronecker/t'th power optimizations will "blank out" more p values for every k.
mklasson - you could probably comment on hard numbers for this kind of thing. do you know the proper way to calculate the increase in "blanking" based on each single prime being removed and leaving the other 11? it'd be cool to know in advance how each k might affect things before the next prime is found.
There are also a few intangibles that you may not be considering that come from finding a prime too. Increased user participation comes to mind. Sad to say, but no one at Anandtech is writing stories about how awesome our sieving is. "3rd largest prime discovered" makes a better headline than "about a dozen people run a program that involves numbers on their computers and stuff."
Another random thought I just remembered is that the weight of each k is not constant: it is more likely that when a prime is found, it will be found for a k value with the most n, not the least. so basically, to correct the equation i'd do something like the following:
change the expected # of primes from 0.82 --> ~6 (so it is comparing 3-20 vs 3-20 instead of 3-4), increase the # of eliminated tests from 50k --> 70k, add in some sort of boost for permanently doubling the speed of the rest of the factoring, and you'd have to include some good-will bonus for attracting new users and resources. I don't have time right now to calculate how that would affect the ratio between sieving and prp, but my ballpark estimate tells me that it is much higher than 10.
I'd still stick with my rule of thumb that says all machines over 1GHz should be running SB, and sieve with the rest. Another way to look at it: if your machine is running over 100k/s on the sieve, it'd probably be better used helping find the next prime by running SB.
-Louie
That's kind of tricky. If I used that as the rule of thumb, then I'd have to use my P4-1700 for sieving, and my PIII-1000s for prp.
Originally posted by jjjjL
Another way to look at it is that if your machine is running over 100k/s on the sieve, it'd probably be better used helping the find the next prime by runing SB.
-Louie
All the other comments I agree with. Finding a prime of 1 million+ digits would definitely boost project participation, which will become increasingly important as we go up in n values in our prp tests.
In the end, we've already done a pretty good job at sieving and largely cleared the way for prp testing. What we do from now on at sieving will still help the project, but not as much as it has up to now. Most of the remaining candidates will have to be prp tested anyway. If the formula I am using for factors vs. p value is correct, then we should expect roughly 47,000 more candidates to be eliminated by sieving for 20T<p<200T (plus roughly 10,000 up to 20T; both figures for 3m<n). 47,000 candidates is roughly 7-8% of the remaining candidates. So more than 90% of the remaining candidates will be removed by prp testing.
To put it another way, the first prime we find will probably remove more candidates than all the sieving going forward.
Still, this does not mean we should stop sieving, because slower machines are still eliminating significantly more candidates by sieving (by a ratio much higher than x10). Also, sieving does not need the machine to be connected to the internet. Although it is a personal decision, I think the idea of using prp-efficient machines (especially P4's) for prp testing, and prp-inefficient machines (especially those slower than 1GHz, plus those not connected to the internet) for sieving makes sense at the p levels we are at.
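Nuri doesn't state his factors-vs-p formula, but a standard sieving heuristic (my assumption, not necessarily his) reproduces figures in the same ballpark: sieving N surviving candidates from p1 to p2 finds roughly N * (1 - ln(p1)/ln(p2)) factors.

```python
import math

# Heuristic (an assumption, not Nuri's exact formula): the expected
# number of factors found while sieving p1..p2 over N surviving
# candidates is roughly N * (1 - ln(p1)/ln(p2)).
def expected_factors(n_candidates, p1, p2):
    return n_candidates * (1 - math.log(p1) / math.log(p2))

est = expected_factors(616_000, 20e12, 200e12)
print(round(est))  # ~43,000: in the same ballpark as the ~47,000 above
```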
I second that. My Tualatin needs more than 100 hours to complete a +3M PRP test, plus it's relatively more efficient at sieving (sieving depends less on memory bandwidth and cache size than PRP testing).
Originally posted by Nuri
Although it is a personal decision, I think the idea of using prp efficient machines (especially P4's) at prp testing, and prp inefficient machines (especially those are slower than 1GHz + those not connetted to the internet) at sieving makes sense at the p levels we are at.
I think using P4's for sieving is just a waste of energy.
Just wondering, is the weight of each k still an issue after sieving this deep?
Originally posted by jjjjL
Another random thought I just remembered too is that the weight of each k is not constant, that it is more likely that when a prime is found, it will be found for a k value with the most n, not the least.
Disregarding factors below 5T (or 10T), is there still a significant difference in the number of factors found for each k?
Sieving up to a couple of T removed more candidates from k's with a low weight, thus leaving fewer candidates to be removed when sieving deeper, compared to k's which had fewer candidates removed at the beginning of the sieving process.
While I don't know how to calculate it theoretically, I can always run a few tests. Doing just that for a section of 3000 primes in the 11T neighbourhood gives you the following.
Originally posted by jjjjL
mklasson - you could probably comment on hard numbers for this kind of thing. do you know the proper way to calculate the increase in "blanking" based on each single prime being removed and leaving the other 11? it'd be cool to know in advance how each k might effect things before the next prime is found.
Database used:
k: 4847. All remaining n = 15 (mod 24).
k: 5359. All remaining n = 6 (mod 8).
k: 10223. All remaining n = 5 (mod 12).
k: 19249. All remaining n = 14 (mod 36).
k: 21181. All remaining n = 20 (mod 24).
k: 22699. All remaining n = 46 (mod 72).
k: 24737. All remaining n = 7 (mod 24).
k: 27653. All remaining n = 33 (mod 36).
k: 28433. All remaining n = 1 (mod 24).
k: 33661. All remaining n = 0 (mod 24).
k: 55459. All remaining n = 10 (mod 12).
k: 67607. All remaining n = 3 (mod 8).
Database initialized. 624935 unfactored.
4847 has 61872 unfactored 3000000<=n<=20000000
5359 has 76825 unfactored 3000000<=n<=20000000
10223 has 71384 unfactored 3000000<=n<=20000000
19249 has 26086 unfactored 3000000<=n<=20000000
21181 has 61055 unfactored 3000000<=n<=20000000
22699 has 25069 unfactored 3000000<=n<=20000000
24737 has 64509 unfactored 3000000<=n<=20000000
27653 has 35943 unfactored 3000000<=n<=20000000
28433 has 34684 unfactored 3000000<=n<=20000000
33661 has 60512 unfactored 3000000<=n<=20000000
55459 has 85357 unfactored 3000000<=n<=20000000
67607 has 21639 unfactored 3000000<=n<=20000000
When all 12 k values are tested:
348 p values are discarded
When 4847 is skipped:
383 p values are discarded, 3% practical speedup
5359: 456 p, 19%
10223: 395 p, 13%
19249: 383 p, 1%
21181: 380 p, 9%
22699: 379 p, 7%
24737: 385 p, 9%
27653: 392 p, 9%
28433: 383 p, 4%
33661: 383 p, 10%
55459: 391 p, 13%
67607: 431 p, 23%
With that said, I must add that this is with my siever. Sobsieve's numbers ought to be similar though. These results are without using the order(2,p) idea. I get a small performance increase from using it in my siever, but judging from Paul's comments that doesn't seem to be the case for sobsieve.
Sooo, let's hope we kill off 5359 or 67607 next. An idea is of course to focus on prp-testing solely 67607 until either a prime is found or all n values less than 20M are eliminated. Considering how "few" n values there are left below 20M for 67607, this may actually be a good idea. Then we could take care of the other k values, with that juicy 23%-ish (assuming that's valid for sobsieve as well) speedup to go... What do you think?
Oh, and Louie, while I have you on the line... is there any way I could find out exactly which n candidates are left? I'm reading /sieve/results.txt right now, but it'd be good to see the n values eliminated by prp tests as well.
Regards,
Mikael
I think this amounts to saying that flexible resources should be allocated in the manner that minimizes the time until finding the next prime. The previous calculation was based on the premise that flexible resources should be allocated in the manner that most rapidly removes candidates from the sieving report. In both cases there are many resources that will ignore our discussion and continue to do whatever they do - we are only concerned with the effect of moving a small number of computers.
Originally posted by jjjjL
I don't know how correct it is to discuss the efficiency of sieving 3M-20M vs the testing from 3M-4M ... There are also a few intangibles that you may not be considering that come from finding a prime too. Increased user participation comes to mind.
-Louie
It's possible that sieving is still best - if the sieve removes 2 candidates that are higher than the last assigned value but lower than the next prime, then sieving would be more effective. If on average sieving removed fewer than 1 from this range, then prime testing is the quickest way to the next prime. If the next prime is already assigned, then it doesn't matter what we do.
We can model this. Use the Proth weights and Poisson model to calculate the probability of no primes between 3.4 million and "x", and take the negative derivative to get the probability density that the next prime is "at x". Fit a cubic spline to the total number of candidates remaining, and use this to calculate the fraction of candidates between 3.4 million and x - this is the probability that an eliminated candidate is in the right range to speed up the next prime. Multiply by 14 because we can detect 14 candidates through sieving in the same time we can do one prime test. Numerically integrate over all values of x.
The result is that on average, the 14 sieving results will remove 1.15 candidates that come before the next prime. Technically this says that sieving is a tiny bit more effective than prime testing. Practically, it says they are in a dead heat for effectiveness in getting to the next prime.
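The model above can be sketched numerically. Everything below is an illustrative stand-in (an assumed ~1/n prime density normalised to 6 expected primes, and a uniform candidate spread instead of the cubic spline), not the real Proth-weight data, but it happens to land near the 1.15 quoted:

```python
import math

# Illustrative stand-in for wblipp's model -- NOT the real Proth
# weights: assume prime density ~1/n over [N0, N1], candidates uniform.
N0, N1 = 3.4e6, 20e6
lam_total = 6.0  # assumed expected primes in [N0, N1]

def cum_primes(x):
    # Lambda(x): expected primes in [N0, x] under the ~1/n assumption
    return lam_total * math.log(x / N0) / math.log(N1 / N0)

def next_prime_pdf(x):
    # density of the next prime's location: d/dx [1 - exp(-Lambda(x))]
    return (lam_total / (x * math.log(N1 / N0))) * math.exp(-cum_primes(x))

def frac_candidates_below(x):
    # assumed uniform candidate spread over [N0, N1]
    return (x - N0) / (N1 - N0)

# E[sieved candidates landing before the next prime], sieving 14x faster
steps = 100_000
h = (N1 - N0) / steps
expected_hits = sum(
    next_prime_pdf(N0 + (i + 0.5) * h) * 14
    * frac_candidates_below(N0 + (i + 0.5) * h) * h
    for i in range(steps)
)
print(round(expected_hits, 2))
```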
I have tried sobsieve 1.28 in Windows 98 SE and the console version is ~10000 p/s slower than in Windows 2000 but the GUI version is ~21000 p/s faster (159000 p/s versus 180000 p/s). The buggy 1.26 GUI does 190000 p/s!
I have been thinking a lot about biquadratic residues recently, as I know that a faster implementation will speed up the program slightly. Ditto for octic and sextic. I thought that I had something last week and saw a dramatic increase in speed - unfortunately it turned out to be a bug. However, should I manage to get a speedup I will let you know.
Originally posted by mklasson
Paul,
a small note on biquadratic residues.
If p = 3 (mod 4) then b is a biquadratic residue iff b is a QR.
It turns out that for the values of k that we are interested in, we only perform a biquadratic test when p=1 mod 4. Think about gcd(p-1,4): it is 2 when p=3 mod 4.
However, I know that if I can find a fast test for biquadratic residues (using bit-twiddling, in the same way as the Kronecker symbol can be evaluated) I can make the code a little faster.
Regards,
Paul.
PS Last Wednesday night I was scribbling away and my partner asked what I was doing, so I replied to her that it was to do with "biquadratic reciprocity". You don't get to say that very often!
Last edited by paul.jobling; 04-14-2003 at 06:22 AM.
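For the p = 1 (mod 4) case Paul mentions, there is at least a cheap power test (an Euler-style criterion, sketched here for illustration; this is not necessarily how SoBSieve implements it, and it's a modular exponentiation rather than the Kronecker-style bit-twiddling Paul is after):

```python
# For prime p = 1 (mod 4): nonzero b is a fourth power mod p
# iff b^((p-1)/4) = 1 (mod p).  Brute-force cross-check below.
def is_biquadratic_residue(b, p):
    # assumes p is prime with p % 4 == 1 and b not divisible by p
    return pow(b, (p - 1) // 4, p) == 1

def brute_force(b, p):
    return any(pow(c, 4, p) == b % p for c in range(1, p))

for p in (13, 17, 29, 37, 41):   # primes = 1 (mod 4)
    for b in range(1, p):
        assert is_biquadratic_residue(b, p) == brute_force(b, p)
print("criterion agrees with brute force")
```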
EDIT: I was seeing an artifact. Ignore!
Last edited by paul.jobling; 04-14-2003 at 06:15 AM.
Yes, the P4 is not too hot when it is given 'normal' integer and FPU computations rather than SSE2-optimised code. As Mikael says, most of the work in SoBSieve is modular multiplication, and to do that efficiently SSE2-specific code should be used (which I do have in NewPGen, but have not put into SoBSieve yet).
Originally posted by mklasson
afaik the P4's [regular, not sse2] fpu is horribly slow. I'm guessing Paul is using it to do modular multiplication, which is where most of the work goes. Of course, that's just a guess and I'd be most pleased to get confirmation on that... I'm having trouble writing a really fast mod_mul myself.
Regards,
Mikael
Regards,
Paul.
Hi, I just looked through some of my sieving results. I have found one factor:
11817166171171 | 22699*2^3317518+1
where the exponent is below the largest exponent tested.
There are a few others (in the 3,500,000's) which are getting close too.
My interval should be complete within 10 days, so I'm wondering if I should report some of the factors now.
Alan
Originally posted by Alan
My interval should be complete within 10 days, so I'm wondering if I should report some of the factors now.
Depends on the answer to the question raised in this thread:
http://www.free-dc.org/forum/showthr...&threadid=2870
In any case, it's advisable to submit factors close to the maximum of the current test window immediately, as there is no need to do a PRP test on that k/n pair.
If tests get deleted ASAP, then it would be better to report factors within the current test window, too. For the individual computing that test at the time it might be unsatisfying to have the test aborted, but maybe he can be credited for the full test anyway.
I have been sending the factors found for 3M<n<4M immediately, for the reason Mystwalker has given.
Doh...
Originally posted by paul.jobling
It turns out that for the values of k that we are interested in, we only perform a biquadratic test when p=1 mod 4. Think about gcd(p-1,4): it is 2 when p=3 mod 4.
Regards,
Mikael
I usually report factors whose n's are at most ~100,000 bigger than the current test window immediately - so right now that would be everything below a little over 3.5M...
This is what WinTop reports for V1.28 after running 4 days:
Joe O
Along the same lines of "why don't we increase the base n above 3M to increase the sieve speed", here's another slightly more controversial idea.
Why not eliminate the three k's with the lowest densities of remaining candidates (19249, 22699, 67607) from the sieve? Each of these k's has about 1500 candidates remaining for each band of 1M n. Compare this with k=55459, where there are about 5000 candidates per 1M n.
Removing these three k's would increase the sieve speed by an apparent 25% (measured), but because there would then be fewer candidates - 12% fewer, to be precise - the actual increase in the rate of candidate removal would be reduced to 10%.
125% * 0.88 = 110%
Don't think for one second that I am suggesting we do this. After all, I am one of a small number promoting the 300K-20M sob.dat file. Just thought it worth sharing.