in this thread.
1.28 203kp/s
1.33 254kp/s
on a P3-Celeron(Tualatin) with 1500 MHz, 128 MB SDRAM, alpha=2.5, n=3673052, k=21181
cEMs/sec=155.000 time left (for full test)=91hours 5minutes
Troodon: Ah, ok!
Joe O
priwo uses about 5.7 GHz days at n=3.5M
on a P3-Celeron(Tualatin) with 1500 MHz, 128 MB SDRAM, n=3673052, k=21181
cEMs/sec=155.000 time left (for full test)=91hours 5minutes
MikeH uses about 4.7 GHz days at n=3.5M
This is the log from the P3-850 quoted above (it's nothing special, a Dell Latitude laptop). As you can see, ~46 minutes per block, 173 blocks for this test (n=3504985), so that's just over 5.5 days.
Lagardo uses about 3.5 GHz days at n=3.5M
[B]For n=3.5M (that's about 172 blocks) just under 54 hours for the whole test. Or scaled to 1GHz, about 3 days 12.75 hours for a full test on a 1GHz Tbred.[/B]
Most of the variation is probably memory speed. 5 GHz days looks like a conservative round number averaged over machines.
The numbers for the 31337 tests around 13.5M are coming in lower than our scaling model would suggest, although this may be related to break points in the FFT scaling. Our scaling model suggests a factor of {[13.5*10^6*ln(13.5*10^6)]/[3.5*10^6*ln(3.5*10^6)]}^2 = 17.7, so an elite test should take about 88 GHz days.
54.5 GHz days
Originally posted by OberonBob
ok, on my PII 350 here is what I am seeing:
~ 152000 cEMs/sec
155.8 days
figuring this as 1.5 GHz, it's 58 GHz days.
Originally posted by OberonBob
alrighty, I am running an AMD 1900+ here is what I am seeing.
k: 21181
n: 13466972
cEMs/sec: ~680000
# of blocks: 9773
average block time: 5:42
est. time to complete: 3342366 secs or 38.7 days
The discrepancy may indicate a problem in our scaling estimate, or it may be variation caused by different processors and lumpiness in the FFT size transitions.
AMD Athlon 1900+s run at 1600 MHz, fyi.
I've found the problem: the In-order queue depth option was set to 1. If I set it to 8, it works ~2X faster!
Now, I think a good estimate would be 5 or 6 GHz days.
So, wblipp, what's the result?
Sorry if you have already mentioned it and I could not understand. If so, could you please rephrase it? If not, could you please inform us on what other figures you need?
Thx in advance.
The major results are that within wide bounds the sieving level doesn't matter much, and that our model has failed to capture an important aspect of the system. More on this failure, and what I'm going to do about it, later.
Originally posted by Nuri
So, wblipp, what's the result?
I updated some of the data and started working through questions. The simplest questions went fine.
For the update, we will soon be testing n=3.8 million, and we have sieved through about 23*10^12. Assuming no prime found by 3.8 million, the 50th percentile points for the next three primes are increased to 4.85 million, 7.10 million, and 16.5 million.
The simplest question is "How long will it take assuming we continue with 2% of resources dedicated to sieving?" The answer is that we will reach the 50th percentile milestones in 255, 784, and 2625 days.
The next simple question is "If we continue with a fixed percentage of resources dedicated to sieving, what is the optimal fixed percentage?" The answer is that 2.7% of resources would make these numbers 256, 783, and 2624.
The next simple question is where the problems show up. It seems intuitive to many of us that as sieving proceeds to higher levels it becomes less effective, so the resources devoted to sieving should decline over time. One way to achieve that would be to have the percentage of resources devoted to sieving set at some level today, and to decline as the sieving progresses, perhaps as a negative power of the sieving level. The next simple question is “If the percentage of sieving resources is a function of the form a/(SieveLevel)^b, what values of a and b are optimal?” Functions of this form can be greater than 100% in the early times, so we must interpret these values as 100% sieving. Following the search path, we soon notice that something strange is going on. We soon move into the range where the sieving starts at 100%, but we keep finding slightly better solutions at endlessly higher and higher powers. After a while it becomes necessary to stop mindlessly cranking numbers and actually think about what is going on.
Here’s what is going on. The question we are really asking the model is “What is the fastest way to get to n=16.5 million?” We know the third prime isn’t really going to be at exactly 16.5 million – this is just the midpoint of a huge range of numbers where it might show up. But it seemed like a good proxy for the real question, and for the earlier primes it gave us an easy way to adjust the model for the changes that happen when we find primes. But think for a moment about the best way to get to any fixed value of n. You would sieve some and test some. One thing that would slow you down would be to test a number that is later removed by sieving, so you would want to avoid that. To avoid that loss of efficiency, you would want to do all the sieving first, and then follow it by all the testing. That's what the functions are trying to achieve! At any power, the sieving is 100% for a while and then drops off rapidly. As we progress to higher and higher powers, the drop off comes faster and faster, getting closer and closer to this optimal way to reach a fixed number. The limit is that the fastest way to reach n=16.5 million is sieve through 5.8*10^15 then test the remaining n values. This approach will reach the three milestones in 474, 880, and 2609 days.
The first thing to observe is that within wide bounds the level of sieving doesn’t make much difference. Most of the work is going to be testing, and it’s going to take as long as it takes. Optimizing our sieving for reaching n=16.5 million will save 15 days out of 7 years.
I think the problem with the model is that it fails to capture the situation that the next prime may come very soon or it may be a long ways off. We want some resources testing in case it comes soon, and we want some resources sieving in case it is far off. The fixed milestone approach loses that balancing of uncertainty. I’m going to try changing the model by focusing on the next prime, and find the 10 mid-decile values – the values by which there is a 5% chance of finding the next prime, 15%, 25% … 95%. Then find the times to reach these 10 values and ask about the average of these times and how it changes with sieving.
wblipp, thanks for the response.
5.8*10^15 :shocked:
That's huge... 5,800T (or 5,800,000G). If I recall correctly, the largest figure pronounced so far at this forum was 400T. At levels that high, we would probably be finding a factor per 177G on average (or roughly one every two weeks; may change depending on the machine speed, of course). I'm not sure, but I guess that's also the time to prp test a candidate at n=16.5m.
Just some brainstorming. I'd be glad if you could respond to the questions below.
A question: Does your model consider the distinction between factors found above/below the effective prp n level? (For example: as of today, we're at n=3.8m, so effectively, roughly 95% of the unique factors found today are helping to reduce the number of candidates we will be testing in the future, and the remaining 5% eliminate candidates that were already tested. The same percentages would be 20% to 80% at n=16.5m, so if we were not concerned about double checking prp test runs at all, the limit for sieve depth would be where it finds unique factors at least five times faster than a prp test for 16.5m could be finished.)
A second question: Does the model adjust for sieve client speed increase due to removal of one of the k values when a prime is found? (and of course, the decrease in the number of unique factors found per sieve range)
Anyway, I am curious about something else too. I guess your model can solve for this:
Let's assume that we are aiming to reach p=500T at sieve when n reaches 18m at PRP. Let's also assume that we have the ability to keep a fixed percentage of our resources at sieve.
If so, what would that fixed percentage be, and when would we be reaching that?
PS: 500T and 18m are arbitrary. I just assumed that 500T would be deep enough, and we would want to shift our sieving effort to 20m+ ranges as we approach 20m at prp. I'd be very happy if you could also look at some sensitivities around those two numbers.
Regards,
Nuri
The model knows that only sieved values above the current test point increase the progress of testing. First the sieving differential equation determines the sieving level at time t. After that has been solved, the sieving level enters the differential equation for the progress of "n" via Mertens' Theorem, which tells us how many "n's" we will advance for every test performed.
Originally posted by Nuri
A question: Does your model consider the distinction between factors found above/below effective prp n level?
Yes. At each milestone the sieving gets faster because there are fewer k values to sieve, and the testing gets faster because there are fewer k values to test.
A second question: Does the model adjust for sieve client speed increase due to removal of one of the k values when a prime is found? (and of course, the decrease in the number of unique factors found per sieve range)
Perhaps I should have written a summary for the model, but it's all in this thread someplace, often with several revisions as people pointed out errors and improvements.
That should be easy, but it may be a few days until I get to it.
Let's assume that we are aiming to reach p=500T at sieve when n reaches 18m at PRP. Let's also assume that we have the ability to keep a fixed percentage of our resources at sieve.
If so, what would that fixed percentage be, and when would we be reaching that?
i think the underlying assumptions need to be reanalyzed when the model recommends delaying the discovery of the next prime by 100% (250 --> 500 days) in order to save two weeks over 7 years in the discovery of prime 8. this model ignores what we discussed earlier about how user growth is non-existent or negative in long periods without primes and huge afterwards.
i think the model shows two things:
1) making this model is hard
2) whether we stop sieving now or sieve another 5000T, we won't dramatically change the rate at which primes are found.
this is both good and bad. it means that all sieving is worth doing, but also means that all prping is worth doing too. the majority of the work needs to be done with SB and I think we have a good balance of resources right now.
i think looking short-term may make analyzing it easier. if our goal was only to find the next prime and to ignore double-checking for the moment, that would mean that to compare sieve rates to prp rates, you'd compare the (weighted) average time it takes your computer to do a test between where we are and the next prime, and the number of candidates you can remove by sieving between now and the next prime.
since "the next prime" is an unknown value, let's use the prediction n=4.8M. so we need to check all numbers between 3.8-4.8M before we have a better than average chance of "next prime." remembering that about 20% (not positive on that figure, someone could calculate it more accurately) of the results are for numbers that already have a factor, that means that you should compare your prp rate to 80% of 1/17th (or 1/20th for dual) of your sieve production. So figure out how large a range you can sieve in the time you can do one prp test. divide the # of factors by 25, and if it's over 1, the sieving is still a short term benefit to the project. let's say that number falls below 1, does that mean it's no longer a benefit to the project? no, because we're discounting the factoring for DC and the factors found for n > 4.8M BUT it does mean that it is only a benefit for the project in the long term. i'd say when the ratio falls below 1, it makes sense to stop sieving and start prping. save the sieving until after a prime is found. at that point, the sieve will be faster and the ratio will be improved because "next prime" will be higher.
i believe the ratio is still over 1 right now, but i don't think it will stay that way much longer. i've just run though some numbers. it's like an economic marginal benefit analysis.
assumptions:
a factor for a number between the current test TOP bound (3.8M) and 4.8M is a perfect substitute for a prp test of a number with n in that same range.
trends:
factor density --> rapidly decreasing, exponentially
current test TOP bound --> increasing but at a slowing rate
test length time --> increasing, but sub-exp
my data points:
SB v1.10 and SoBSieve v1.34
1GHz athlon (100MHz fsb)
current sieve range = p ~ 25T
current test size = n ~ 3.8M
sieve speed = 160k p/s = 0.576G / hour
factor density = 35 factors / 42.2G ~ 0.830 factors / G
test size = n = 3.8M ~ 221 blocks
SB speed = 160k CEMs/s = 2 blocks / hour
therefore, test time = 221 / 2 = 110.5 hours / full test
so marginal analysis would ask "how much of the perfect substitute (factors in the pre-"next prime" window) could i have if I were to sieve with my 110.5 hours instead of do the test?"
since the factor density has already fallen to less than 1 factor / G, it turns out the total throughput for my machine is only 53 factors in the time I could do one test. correcting for the fact that 20% of those are duplicates and 95% of them are in irrelevant ranges (based on the "next prime" window), that means that only about 2 factors are expected to be immediately useful in finding the next prime.
also remember that the window is decreasing in size the closer we get to the next prime. even worse, the factor density is dropping off like crazy. I may have a factor density of 0.83 at p=25T but just look at the sieve coordination thread where people have done high ranges and reported the number of factors found:
49950-50005 chvo [complete] 25 factors found = 0.45 factors / G
50005-50100 biwema [complete] 43 factors found = 0.45 factors / G
so if the project tests up to n=4M by the time we reach p=50T, the falloff in factor density would effectively turn further sieving from both a short AND long term benefit into simply a long term one.
the point i want to make is that even if you ignore all the intangible long term benefits of sieving, such as the factors that prevent future double checks or the factors that prevent future prp tests beyond the immediate goal of the next prime, you can still make a sorta "worst case" comparison where you show sieving in the dimmest possible light. since it can be shown that further sieving is good even then, it must still be good *with* the extra, hard-to-quantify benefits of sieving. heck, in a way, this almost proves the opposite of what i started off saying... that it doesn't really matter how much we sieve. if i had more faith that the n=4.8M number was a perfect prediction, it would mean that all resources should be taken off prp testing until we hit p=50T.
-Louie
just to reinforce this point some more, let's say that we did sieve up to p=50T, effectively doubling our sieve depth. how would this affect the speed with which we find the next prime?
Originally posted by jjjjL
2) whether we stop sieving now or sieve another 5000T, we won't dramatically change the rate at which primes are found.
some rough numbers would be
prp test = 110.5 hours
factor density between 25-50T = 0.65
sieve rate = 0.576G / hour
duplicate rate = 0.20
window size / total sieve width = 0.05
so that means in the time it takes to sieve from p=25T to 50T, we could have done 393 tests. however, we would have eliminated 640 tests from the window. that's a savings of 247 tests.
since SB completed over 200 regular sized tests today, even if we sieved to the more optimal level of 50T, we would only speed up the discovery of the next prime by about 1 day.
-Louie
What does your model say about stopping sieving at once? How would the time required to get to the three primes change then?
This part isn't right - it's an artifact of using the 50 percentile milestone as a substitute for finding the next prime. We have what the reliability folks call increasing residual lifetime - that is, the longer we don't find a prime, the further away the expected time becomes. For a metaphor, it's like flipping pennies looking for a head, but every time we get a tail, the weight on the coin changes to make heads less likely on the next flip. The time until failure for most semiconductors has this property - it's part of why burn-in is so effective.
Originally posted by jjjjL
also remember that the window is decreasing in size the closer we get to the next prime.
As I mentioned towards the bottom of my message, I think these fixed milestones are the biggest problem with the model. I think it's also the source of the problem that the model says "double the time to find the next prime so you can save a few weeks in finding the third prime." I'll get around to trying the mid-decile modification after I respond to some open questions on this thread.
that right there is why i hate these prediction methods so much. how absurd is it that the closer we get to a fixed number, the further we believe we are from it? the day we find the next prime will be the same day that this model is the most incorrect. how can we even call that a prediction model? shouldn't it be able to tell us something about the future in a reliable way?
Originally posted by wblipp
increasing residual lifetime - that is, the longer we don't find a prime, the further away the expected time becomes.
explain to me how a model that is always wrong, and continues to get more wrong can be useful.
-Louie
I see that for exponents in the range of 15 million, GIMPS does trial factoring to 2^65, which is almost 4*10^19. They must be harder to shock.
Originally posted by Nuri
5.8*10^15
That's huge... 5,800T (or 5,800,000G). If I recall correctly, the largest figure pronounced so far at this forum was 400T.
0.1% of resources devoted to sieving is sufficient to reach this. We would get to n=18 million in 2925 days.
Originally posted by Nuri
Let's assume that we are aiming to reach p=500T at sieve when n reaches 18m at PRP. Let's also assume that we have the ability to keep a fixed percentage of our resources at sieve.
If so, what would that fixed percentage be, and when would we be reaching that?
With zero additional sieving, the time to reach the milestones would be 258, 808, and 2736 days. That's about 4 months later for reaching the third milestone.
Originally posted by garo
What does your model say about stopping sieving at once? How would the time required to get to the three primes change then?
Wblipp,
According to your model, how does the time for the 50% probability completion of the 17th prime change with changes in resources between sieve and prp? Would the differences be much more prevalent than in the lower numbers? I'm assuming that once we sieve up to a certain p value we will start sieving for above 20,000,000, but I also understand this is going to be far off in the future. Maybe your model could help us determine the best time to switch over for a maximum benefit to the project. Mainly I'd like to know how much the sieve is helping to bring the completion of the project down to a more reasonable level. After all, I'd like to be alive to be at the celebration party when the 17th is found.
could P-1 be incorporated and maybe make a bigger difference?
I've been trying to find a response to this for a while. I've even tried to figure out some of the math and build my own model (I failed miserably). But if someone could let me know: I know that the sieving doesn't matter much for the next few prime discoveries, but for the completion of the project it is my belief that it will have a much larger effect. Please, someone offer some input here.
Factors of Mersenne numbers are of the form 2*K*P+1.
Originally posted by wblipp
I see that for exponents in the range of 15 million, GIMPS does trial factoring to 2^65, which is almost 4*10^19. They must be harder to shock.
For a single P, not every K produces a prime, and even if it does, it needs to be 1 or 7 mod 8 (IIRC). So they won't have to test every prime factor below 4*10^19, but just a very small fraction (< 1%?).
The larger P gets, the faster trial factoring goes.
right, if you just take the 2*p out of the picture, they have a 2^25 sieving advantage over us due solely to the form of their factors. That means that us sieving to 2^45 would equate to GIMPS sieving to 2^70. However, they have only sieved to 2^65 (for the range originally being commented on.)
Originally posted by smh
Factors of Mersenne numbers are of the form 2*K*P+1.
For a single P, not every K produces a prime, and even if it does, it needs to be 1 or 7 mod 8 (IIRC). So they won't have to test every prime factor below 4*10^19, but just a very small fraction (< 1%?).
The larger P gets, the faster trial factoring goes.
This basically means that the parallel nature of our sieve makes it many times more efficient than the trial factoring done by GIMPS... they just happen to have a nice factor form that makes it appear as though theirs is more efficient.
-Louie
has the final analysis been completed or is someone waiting for some more data, or trying to find time to do it around their day job?
I and probably some of the rest of us would be interested.
The thread above has some analysis, saying basically that more sieving would only subtract a few days. Is this definitive? I did read above somewhere that some people (Louie, I think) did not like some of the "artifacts" produced by this type of model, and/or that the model threw out some strange stuff.
*bump* because I'm hoping for new eyes and opinions.
Whew, all this is way too complicated.
I liked the easy model: as long as we get more factor throughput than we can run tests, we should sieve (adjusted with some constants). According to this one, we should sieve more. We had some discussions like this some time ago.
Unfortunately, everything has become even more complicated by the threshold of proth sieve, which forces us to doublecheck until the moment we decide that we need new factors now and continue FP sieve even with slower software (which will happen at 300G? 500G?)
BTW, why the hell 2^50? 2^48 or 2^64 I would understand, but 2^50? There must be a reason, though. Does somebody know why this problem cannot be fixed by changing 50 to 51 in the source code somewhere?
And then, if we join with PSP, it becomes even more complicated.
I hope that in V3, there will be an option to factor the number before PSP, so that at least this question disappears.
Sieving is still for people who have a particular interest, read the forum etc., and there is no way to convince the "only-client" people to join sieve. That's no problem though. We can support them by sieving as much as we can.
Anyway, is there an allocation problem? We know what would be better (more sieving).
The only interest I see is in a model in order to join the statistics of PRP and sieving/factoring one day. Is there such an intention? Don't know.
I'm not interested in personal stats, so I will leave this alone anyway.
Sorry for my bad thoughts. Yours, H.
The 2^50 limit comes from a trick using the floating point unit to help perform the mod-mul operation in the baby-steps giant-steps algorithm.
Because the x86 FPU is natively 80-bit (unlike others, which are 64-bit), the mantissa component is large enough (64 bits) that there is space to perform a relatively accurate mulmod of up to 60-bit numbers.
Phil Carmody's post on sci.crypt
The code (in C, fixed from an incorrect version posted in the thread) is:-
__int64 mulmod(__int64 x, __int64 y, __int64 p)
{
__int64 res,xyl = x * y;
double xyf = (double)x * (double)y / (double)p;
__int64 r = p * (__int64)xyf; // CRITICAL
res = xyl - r;
if( res < 0 ) res+=p;
if( res >= p ) res-=p;
return(res);
}
There is also some benefit to pre-computing:
double recip_p=1.0/(double)p;
and changing the third line to:-
double xyf = (double)x * recip_p * (double)y;
Because we are using x86, we could extend the sieving up to 62 bits (let's say 60 to be safe). That is, if proth_sieve is implemented in the same way that Phil Carmody implemented his.
Given that it has taken a considerable time (over 2 years) to sieve to 2^50 (and we aren't even there yet) then another 10 bits (to 2^60) should keep us busy for another 2048 years (given same participation level and processor speed).