Howard,
I've done a bit of elementary stats work on our top 10 results. A few things come out of this:
1) The best RMSD reported in the top 10 list on the website differs from the one posted in the text details of the generation. I assume from this that you're running your own algorithm on the best ones and coming up with your own number. Since we sometimes spend so much time on a single structure, why aren't we using the same algorithm?
2) #9 on the list is seriously whacked. The top 10 list says 175 gens have been completed, but the text details say only 66.
3) 7 of the top 10 have data beyond gen 100. Of these 7, only 2 found their best results near the end of their efforts. On average, the clients found their best result at 58% of their total effort. If you remove the two that did best towards the end from the equation, it drops to 44% of total effort.
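As a rough cross-check on the figures in point 3, the two averages imply where the two late finishers found their best results. The sketch below uses only the counts and percentages quoted above; the function name is made up for illustration, and because the inputs are rounded percentages the result is approximate.

```python
def implied_mean_of_removed(n_all: int, mean_all: float,
                            n_rest: int, mean_rest: float) -> float:
    """Mean of the removed group, given the means with and without it."""
    removed = n_all - n_rest
    if removed <= 0:
        raise ValueError("n_rest must be smaller than n_all")
    # Sum over everyone minus sum over the remaining group, divided by group size.
    return (n_all * mean_all - n_rest * mean_rest) / removed


# 7 sets average 58%; the 5 left after dropping the two late finishers average 44%.
late_mean = implied_mean_of_removed(7, 0.58, 5, 0.44)
print(f"{late_mean:.0%}")  # prints 93%
```

So the two sets that did best towards the end found their best result at roughly 93% of their effort, which fits the description of them as late finishers.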
My proposal is as follows:
1) Make the # of generations dynamic based on the results of the gen.
2) Always fold out to 100 gens.
3) Take the best result from the 100 and keep going another 20%. If within that 20% you do not find a better fold, abandon the set and start all over again.
4) If you do find a better one within the 20%, go out another 20%. This way we will only waste about 20% of our effort.
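The four steps above can be sketched as a small replay function that, given the recorded RMSD of each generation in a set, reports how many generations the rule would have run. This is only a sketch, not the client's actual code: the function name and parameters are hypothetical, "another 20%" is assumed to mean 20% of the generations completed so far (rounded up), and a lower RMSD is taken to be a better fold.

```python
from typing import Sequence


def stop_generation(rmsd_by_gen: Sequence[float],
                    min_gens: int = 100,
                    extension_pct: int = 20) -> int:
    """Return how many generations the proposed rule would run.

    rmsd_by_gen[i] is the RMSD of generation i + 1 (lower is better).
    """
    if not rmsd_by_gen:
        raise ValueError("need at least one generation")
    total = len(rmsd_by_gen)
    # Step 2: always fold out to min_gens (or as far as the data goes).
    checkpoint = min(min_gens, total)
    best = min(rmsd_by_gen[:checkpoint])
    while checkpoint < total:
        # Assumption: the extension is 20% of the generations run so far,
        # rounded up, using integer arithmetic to avoid float rounding.
        window = (checkpoint * extension_pct + 99) // 100
        window_end = min(total, checkpoint + window)
        window_best = min(rmsd_by_gen[checkpoint:window_end])
        if window_best >= best:
            # Step 3: no better fold in the window, so abandon the set.
            return window_end
        # Step 4: a better fold was found, so go out another window.
        best = window_best
        checkpoint = window_end
    return checkpoint


# Made-up series for illustration only: flat at 10.0 with one better fold at gen 110.
series = [10.0] * 200
series[109] = 5.0
print(stop_generation(series))  # prints 144
```

In the made-up series, the improvement at gen 110 falls inside the first 20-generation window, so the run is extended from 120 to 144 and then abandoned when nothing better turns up.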
By my calculations, the two sets in the top 10 that found their best result towards the end of their work so far would still have been included, but the others would have given up sooner, which we now know to be a good thing.
This is similar to a suggestion I made a while back, but now I have some stats to back up my position.
ms