Originally posted by jjjjL
I have this nagging feeling that a stunningly beautiful model could be constructed that took into account the likelihood of prime discoveries by n-level, sieve speed, expected sieve speedup, and prp speed to properly calculate the weighting of cpu between sieving and prping. ... give me your thoughts on all this.
-Louie

I'm thinking about ways to build such a model, and wondering about the data to drive it. I figure other people may be noodling about this too, so a place to share information would be useful. Here are some of the things I think could be useful in the model.
1. I can find the current distribution of remaining candidates in MikeH's sieve statistics.
2. The time for a "standard PC" to complete a primality test - 3 days?
3. The way primality testing will slow down with increasing n - will n's that are twice as large require twice as long? 4 times? sqrt(2)? ln(2)?
4. The rate at which a "standard PC" can sieve at the present level - 1G per hour? And the way this will change with p. Doubling the p value decreases the number of primes in a range, but it also, I assume, increases the time to test each p. The Prime Number Theorem tells us the first component - but how does the time per test change?
5. The rate at which factors will be found by sieving at various p levels - I think we can estimate this from Mertens' Theorem and the number of remaining candidates (see the first sketch after this list).
6. The rate at which primes are found among the candidates - I think we can use the Proth-weight adjusted prime number theorem for this. A full probabilistic model is probably complex - a reasonable first cut could use the average n-value for finding the next primes (see the second sketch after this list).
7. The speedup of sieving with fewer k's to sieve - linear but with a fixed overhead?
8. The increase in resources from the publicity of finding a new prime - 20%?
9. The increase in resources during periods of no new primes - for purposes of optimal allocation this probably doesn't matter.
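To make item 5 concrete, here's a rough Python sketch of the Mertens-based estimate. Mertens' theorem says the fraction of candidates surviving a sieve to depth p falls off like 1/ln(p), so pushing the sieve from p1 to p2 should eliminate about N*(1 - ln(p1)/ln(p2)) of the N remaining candidates. The numbers in the example are placeholders, not actual project data.

```python
from math import log

def expected_factors(n_candidates, p1, p2):
    """Estimate how many of n_candidates surviving a sieve to depth p1
    get eliminated by extending the sieve to depth p2.

    Mertens' theorem: the surviving fraction scales like 1/ln(p), so
    roughly n_candidates * ln(p1)/ln(p2) candidates survive at p2.
    """
    return n_candidates * (1.0 - log(p1) / log(p2))

# Placeholder figures: 500,000 candidates already sieved to p = 25T,
# sieve then extended to p = 50T.
print(expected_factors(500_000, 25e12, 50e12))  # ~11,000 factors (~2.2%)
```

Dividing that count by the CPU-hours needed to sieve the range (item 4) gives factors per CPU-hour, which is the figure we'd weigh against the cost of a primality test.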
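And for item 6, a sketch of the per-candidate prime probability under the standard heuristic: a random integer near N = k*2^n + 1 is prime with probability about 1/ln(N), and surviving a sieve to depth p boosts that by a factor of roughly e^gamma * ln(p) (Mertens again). Summing over the remaining candidates gives the expected number of primes in a block. The (k, n) pairs below are placeholders, not the real candidate list.

```python
from math import log, exp

GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def prime_probability(k, n, sieve_depth):
    """Heuristic probability that k*2^n + 1 is prime, given that it has
    survived sieving to sieve_depth: e^gamma * ln(p) / ln(k * 2^n)."""
    ln_N = log(k) + n * log(2)
    return exp(GAMMA) * log(sieve_depth) / ln_N

def expected_primes(candidates, sieve_depth):
    """Expected number of primes in a block of surviving (k, n) pairs."""
    return sum(prime_probability(k, n, sieve_depth) for k, n in candidates)

# Placeholder (k, n) pairs for illustration only.
block = [(21181, 3_500_000), (22699, 3_500_002), (24737, 3_500_011)]
print(expected_primes(block, 25e12))  # ~7e-5 expected primes in this tiny block
```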
I'm presently thinking about applying the above parameters to a system of differential equations. p(t) would be the prime threshold for sieving, and n(t) would be the n threshold for primality testing. These equations could be solved for fixed proportions of resources devoted to sieving and primality testing (for variable allocations, too, if we have any idea about how to vary them); a toy simulation along these lines is sketched below. A longer-term planning horizon is probably appropriate - perhaps minimizing the predicted time until four more primes are found.
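To show the kind of thing I mean, here's a toy discrete-time simulation standing in for the differential equations. Everything in it is an assumption to be replaced by real answers to items 1-7: sieving advances p at a constant rate per CPU-hour, a primality test scales roughly as n^2 (a common rule of thumb for FFT-based testing, and exactly what item 3 should pin down), and a fixed fraction f of the CPU pool sieves.

```python
from math import log, exp

GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def simulate(f_sieve, days, cpus=1000.0):
    """Toy simulation of the sieve-vs-PRP trade-off.

    Every constant below is a placeholder, not project data:
      - sieving advances p by 1e9 per CPU-hour, independent of p;
      - a PRP test at n = 3M takes 72 CPU-hours and scales as (n/3M)^2;
      - 0.03 candidates per unit n survive the current sieve depth.
    Returns the expected number of primes found after `days`, with a
    fraction f_sieve of CPU time devoted to sieving.
    """
    p, n = 25e12, 3.0e6          # current sieve depth and PRP frontier
    density = 0.03               # surviving candidates per unit n
    primes = 0.0
    dt = 24.0                    # step size: one day, in hours
    for _ in range(days):
        # Sieving: deeper p thins the candidate list like 1/ln(p) (Mertens).
        p_new = p + 1e9 * f_sieve * cpus * dt
        density *= log(p) / log(p_new)
        p = p_new
        # PRP testing: advance the n frontier with the remaining CPU.
        test_hours = 72.0 * (n / 3.0e6) ** 2   # assumed ~n^2 scaling
        tests = (1.0 - f_sieve) * cpus * dt / test_hours
        n += tests / density
        # Each tested candidate is prime with prob ~ e^gamma ln(p) / ln(2^n).
        primes += tests * exp(GAMMA) * log(p) / (n * log(2))
    return primes

# Compare a few fixed allocations over a one-year horizon.
for f in (0.0, 0.1, 0.3, 0.5):
    print(f, simulate(f, days=365))
```

Sweeping f over a grid like this is the crude version of the optimization; the differential-equation version would let f vary with time.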
Longer term, there are other resource allocation problems that I think we can ignore for now. I'd ignore the question of when to begin sieving beyond n=20M - it seems clear that deeper sieving on the present range has a vastly greater payoff. I'd also ignore the question of double-checking - my explorations of this suggest it is MUCH too early to devote serious resources to it, although to be sure it would be nice to know the answer to one more question:
10. Do we have any validated estimate of the error probability of reported results?
Please jump in with data and ideas.