"What are the odds of knocking off the current king?" goes right to the heart of the problem as I see it. From my inadequate understanding, Howard's best explanation of how this all works suggests that the RMSAs he is getting right now are wildly improbable, on the order of winning-the-lottery improbable. But he has "won the lottery" several times in a row now. If somebody won the real lottery 5 times in a row, most people would start wondering why it is so easy to win.
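To put a number on that intuition (the single-win odds below are invented for the example, not the project's actual figures): independent wins multiply, so five in a row is the single-win probability raised to the fifth power.

```python
# Hypothetical odds for a single lottery win; the exact figure is
# invented here, only the compounding effect matters.
p_single = 1e-7

# Five independent wins multiply: roughly 1e-35.  Odds that small are
# why an observer starts doubting the "pure chance" assumption.
p_five = p_single ** 5
print(p_five)
```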

Do the chances of knocking off the current king approach zero? My rule-of-thumb estimate says no. There are still pretty frequent changes to the top ten. Since the numbers returned from theory don't match the current experimental numbers, the theory may be too incomplete to use for such predictions. As Howard says, given the number of structures crunched to date, the best RMSA SHOULD be around 6.16. Personally, mine is 5.59 and I have run only 800,000 structures. If these results are validated, it is a major discovery. In essence, it implies that somebody running a handful of computers could get a good enough structure within a few weeks.
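A toy best-of-N simulation makes the 6.16-versus-5.59 gap concrete. The distribution and its parameters below are my own invention, not Howard's model; the point is only that, under any fixed distribution, the best of N samples improves very slowly as N grows, so beating the expected minimum by a wide margin is a genuine anomaly.

```python
import random

random.seed(1)  # deterministic runs for the example

# Toy model (my assumption, NOT Howard's actual theory): treat each
# structure's RMSA as an independent draw from a normal distribution.
MEAN, SD = 10.0, 1.5  # invented parameters, for illustration only

def best_of(n, trials=5):
    """Average best (lowest) RMSA seen across several runs of n draws."""
    return sum(min(random.gauss(MEAN, SD) for _ in range(n))
               for _ in range(trials)) / trials

# The best-of-n value drops only slowly (roughly sqrt(2 * ln n)
# standard deviations below the mean), so a result far below the
# expected minimum is lottery-grade improbable under this model.
for n in (1_000, 10_000, 100_000):
    print(n, round(best_of(n), 2))
```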

He has a few things to get done before he celebrates. He has to validate that nobody cheated with the structures sent in to him. He has to make damn sure he isn't getting some seriously goofball result deep in his software that is screwing up the numbers. Next, he must come up with a very convincing explanation as to why the software is 300,000 times better than theoretically expected. This last one may not be easy.

He has a good solid theoretical basis for believing the DF approach, if given billions of structures, will find a close enough structure (defined as less than 6.0 RMSA). His calculations suggest that the number of structures required is roughly:

Small: 1 Billion
Medium: 10 Billion
Large: If it works on small and medium he can get lots of computers.

Considering the nature of the project and his answers to the "current king" problem, he probably expected the results would be something like 5.5 as the average "winning" RMSA with a couple of hundred non-winning structures below 6.0. As an experienced researcher, he also probably tossed in a hefty fudge factor.

Instead of behaving as planned so he could write the dissertation, the experiment has returned some unexpected results.

Small: Thousands of RMSAs way, way below 6.0. Like 2.03.
Medium: RMSA below 5.0 with less than 20% of structures complete. Thousands of structures at less than 6.0.

This is the classic raison d'être of experimental science: finding the point where your results are significantly different from theoretical expectations.


Remember, Howard is a grad student working on his doctoral thesis. If he can survive the guaranteed, incredibly brutal examination of his answers to the above problems, he gets to be a PhD.

My bet is that he will be able to show his code isn't whacked. Cheating seems unlikely, since his method requires proof of work completed. So the really hairy problem for Howard is explaining why his software is 300,000 times faster than expected. It is not likely to be just his fudge factor. Either there is something fundamental missing in folding theory on the biology side, or something fundamental missing in the computer science. Or both. I also bet that Howard will wait for CASP 5 before committing himself completely. As a completely blind test, it can validate the software. If it does well, he wouldn't need an explanation for the results; he would have independent substantiation.
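The anti-cheating point can be sketched in code. Everything below is hypothetical, since I don't know the project's actual protocol; the general shape is that a client's result only counts alongside a digest tied to the intermediate values that produced it, and the server spot-checks by redoing a sample of the work.

```python
import hashlib

# Hypothetical proof-of-work check, NOT the project's actual protocol.
# A reported result is accepted only with a digest over the
# intermediate values that produced it; the server re-runs a random
# sample of structures and confirms the digests match.

def work_digest(structure_id, energies):
    """Digest tying a reported result to its intermediate values."""
    payload = f"{structure_id}:" + ",".join(f"{e:.6f}" for e in energies)
    return hashlib.sha256(payload.encode()).hexdigest()

def spot_check(structure_id, claimed_digest, recompute):
    """Server side: redo the work for one structure, compare digests."""
    return work_digest(structure_id, recompute(structure_id)) == claimed_digest

# Toy "work": a fake, deterministic energy trace for a structure id.
def fake_run(sid):
    return [float((sid * k) % 97) for k in range(1, 6)]

honest = work_digest(42, fake_run(42))
print(spot_check(42, honest, fake_run))    # True: honest client passes
print(spot_check(42, "0" * 64, fake_run))  # False: fabricated digest fails
```

A cheater who skips the computation cannot produce a matching digest without doing the work, which is the sense in which the method "requires proof of work completed."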