Howard,
I've been looking at the crease energy (which I understand is our best method of picking the best conformation) and comparing it to the RMSD. When I examine the two, it becomes very clear that we are handling the sampling side of the folding problem much better than we are handling the scoring side.
I say this because the RMSD graphs tend to descend in a smooth, fairly even curve, from sharply vertical to essentially horizontal, which is what you would expect from a consistent scoring method. The crease energy over the same set of results, however, varies wildly. I conclude from this that our yardstick for measuring the elevation of this spot on the world is not only inaccurate (to use the find-the-lowest-spot-on-earth analogy), it is also highly inconsistent - that is, if you measure two spots that are pretty close in elevation, our measuring tool will often give very different results.
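To put a number on that inconsistency, here's a rough sketch of the check I have in mind (Python, with a placeholder file name and column names - none of this is from our actual pipeline): bin the sampled structures by RMSD and look at how much the crease energy spreads within each bin. If the score were consistent, structures at nearly the same RMSD should get similar energies.

    # Sketch of the consistency check described above. Assumes a CSV with one
    # row per sampled structure and columns "rmsd" and "crease_energy"; the
    # file name and column names are placeholders.
    import csv
    from collections import defaultdict
    from statistics import mean, stdev

    def load_samples(path):
        # Read (rmsd, crease_energy) pairs from a results file.
        with open(path, newline="") as f:
            return [(float(row["rmsd"]), float(row["crease_energy"]))
                    for row in csv.DictReader(f)]

    def energy_spread_by_rmsd_bin(samples, bin_width=0.5):
        # Group structures into RMSD bins and report the crease-energy spread
        # inside each bin; a large spread means the score disagrees with
        # itself on structures of essentially the same quality.
        bins = defaultdict(list)
        for rmsd, energy in samples:
            bins[int(rmsd / bin_width)].append(energy)
        report = {}
        for b, energies in sorted(bins.items()):
            if len(energies) >= 2:
                report[b * bin_width] = (mean(energies), stdev(energies))
        return report

    if __name__ == "__main__":
        samples = load_samples("generation_results.csv")  # placeholder path
        for rmsd_lo, (avg, spread) in energy_spread_by_rmsd_bin(samples).items():
            print("RMSD %4.1f+ : mean energy %10.2f, spread %8.2f"
                  % (rmsd_lo, avg, spread))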
To make matters worse, I'm pretty sure reality is actually worse than the graphs portray, simply because right now we're using the RMSD to eliminate the worst results from each generation. So when we go ab initio and use crease energy, we'll probably be throwing out the best structures in each generation. I now think I understand why you said earlier, in reference to the CASP5 results, that it was clear we needed a better scoring algorithm.
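One quick way to quantify how bad that bias would be (again just a sketch, with made-up field names rather than anything from our code): on a known protein where we have the RMSDs, run the per-generation selection both ways and see how many of the structures that survive the RMSD cut would also survive a crease-energy cut.

    def select_top(structures, key, keep_fraction=0.5):
        # Keep the best fraction of a generation, "best" meaning lowest key value.
        ranked = sorted(structures, key=key)
        return ranked[:max(1, int(len(ranked) * keep_fraction))]

    def selection_overlap(generation):
        # Fraction of the low-RMSD survivors that would also survive an
        # energy-based cut; a low number means the energy cut is discarding
        # exactly the structures we most want to keep.
        by_rmsd = {id(s) for s in select_top(generation, key=lambda s: s["rmsd"])}
        by_energy = {id(s) for s in select_top(generation, key=lambda s: s["crease_energy"])}
        return len(by_rmsd & by_energy) / len(by_rmsd)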
It seems to me, however, that our guided sampling method is now so far ahead of the scoring that it is virtually pointless to continue fine-tuning the sampling side until the scoring side catches up.
I'm assuming you have reached similar conclusions, and yet to all appearances you still seem to be fine-tuning the sampling method. I conclude from this that you don't know how to improve the scoring functions, but you have effort available, so you might as well apply it usefully somewhere, and the only place to do so is the sampling side.
I don't mean to be totally discouraging here (though I must admit I'm feeling that way myself), but it seems to me that we either need to find a way to start focusing our efforts on the area that really matters (scoring) or wait until some bright guy comes up with an idea for how to do so.
One thing we could do, I suppose, is apply all of our current scoring methods to every structure we sample of a known protein, record all the results, and then do some analysis to see which algorithms do best and whether we can mix and match approaches, taking a piece of one algorithm and applying it inside another. Yes, this would vastly decrease our sample size, but we already know we can handle that side of things once we get the scoring worked out.
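In case it helps, here's roughly the shape of that experiment in Python (the scorer functions and data are placeholders; the point is just the bookkeeping): score every sampled structure of the known protein with every method, record everything, and then rank the scoring methods by how well each one tracks the RMSD, e.g. by rank correlation.

    # Sketch of the scoring-method comparison proposed above. The scorers and
    # protein data are stand-ins; the real ones would come from our pipeline.
    from statistics import mean

    def rank_correlation(xs, ys):
        # Spearman rank correlation, computed by hand so there are no
        # external dependencies (ties are ignored for simplicity).
        def ranks(vals):
            order = sorted(range(len(vals)), key=lambda i: vals[i])
            r = [0] * len(vals)
            for rank, i in enumerate(order):
                r[i] = rank
            return r
        rx, ry = ranks(xs), ranks(ys)
        mx, my = mean(rx), mean(ry)
        num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
        den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
        return num / den if den else 0.0

    def evaluate_scorers(structures, rmsds, scorers):
        # structures: sampled conformations of a known protein;
        # rmsds: their RMSDs to the native structure;
        # scorers: dict of name -> scoring function (placeholders).
        # Returns each scorer's rank correlation with RMSD; the closer to 1,
        # the more that score behaves like the yardstick we actually want.
        results = {}
        for name, score in scorers.items():
            results[name] = rank_correlation([score(s) for s in structures], rmsds)
        return results

Whichever scorers come out with the strongest correlation (and the smallest spread within an RMSD bin, per the earlier check) would be the natural candidates to mix and match.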
Thoughts?
ms