New scoring function



Michael H.W. Weber
07-17-2002, 06:32 AM
Please correct me if I have missed something. :D

From the "whatsnew.txt" file it is evident that, with the new CASP5 target, the scoring function - which is used to select the presumably best structure model from the bulk of generated protein structures - has been optimized:

"- pseudo-energy now uses EEF1 solvation term instead of 'crease' energy; this takes a little longer to compute but we have found is more accurate and reliable, hopefully leading to better structure predictions for CASP"

After the last update, the following three new files appeared in my DF work folder:

param19_eef1.inp
solvpar.inp
toph19_eef1.inp

I assume that these are the basis for the scoring/selection of the generated structures, of which ONLY the best will be uploaded. The fact that this function had to be optimized implies that the preceding function was suboptimal. :rolleyes: Since the selection of the best structure to be uploaded to the DF server is carried out on the local client machine, I must conclude that all past results generated by DF throughout this CASP5 experiment are suboptimal as well.
The good news is, however, that obviously we have learned something important about scoring functions. Also, I am still hopeful that the possibly suboptimal structures generated for the previous CASP5 targets are still better than those that will be submitted by the competing work groups. :D
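
To make the selection step concrete, here is a minimal sketch of how I picture it: every generated structure gets a pseudo-energy, and only the lowest-scoring one is kept for upload. This is not the actual DF client code. The name `select_best`, the dictionary layout, and all numbers are made up for illustration, and I am assuming that a lower pseudo-energy means a better structure.

```python
from typing import Callable, Sequence, TypeVar

S = TypeVar("S")


def select_best(structures: Sequence[S], score: Callable[[S], float]) -> S:
    """Return the structure with the lowest pseudo-energy (hypothetical helper)."""
    if not structures:
        raise ValueError("no structures to choose from")
    return min(structures, key=score)


# Toy candidates with invented pseudo-energies under both scoring functions.
candidates = [
    {"id": 1, "crease": -10.0, "eef1": -120.0},
    {"id": 2, "crease": -14.5, "eef1": -95.0},
    {"id": 3, "crease": -12.0, "eef1": -150.0},
]

best_by_crease = select_best(candidates, lambda s: s["crease"])
best_by_eef1 = select_best(candidates, lambda s: s["eef1"])
```

With these invented numbers the two functions pick different structures from the same pool, which is exactly the situation my question is about.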

Therefore my question: on average (if one can say so), how much difference can be expected between the structure models selected by the old scoring function and those that will be selected by the new one?

All the best,
Michael.

Brian the Fist
07-17-2002, 10:34 AM
You are correct in most of your assumptions. However, the 'new' scoring function is not an optimized version of the previous one. It is a completely different one, based on entirely different principles, though of course they both attempt to measure energy.

In our tests sampling known structures, we have recently found that the new function identifies the 'best' random structures somewhat better than crease did. This is not to say that either was poor, though. We are in fact purposely switching now, about halfway through CASP, so that we can see which set of proteins we do better on: those scored with crease or those scored with EEF1. It is quite possible that on average they will do equally well, but we won't know for sure until December, when the CASP results are available.
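
As a rough illustration of that kind of test, the sketch below takes a set of proteins with known structures, lets each scoring function pick its lowest-energy random structure, and averages the RMSD of those picks to the known structure. It is only a sketch under stated assumptions, not our actual test code: the names `rmsd_of_top_pick` and `compare`, the use of RMSD as the quality measure, and every number in `test_set` are invented for the example.

```python
from statistics import mean


def rmsd_of_top_pick(decoys, score_key):
    """RMSD of the decoy that the given scoring function ranks best (lowest energy)."""
    top = min(decoys, key=lambda d: d[score_key])
    return top["rmsd"]


def compare(test_set, keys=("crease", "eef1")):
    """Average RMSD of each scoring function's top pick over all test proteins."""
    return {
        k: mean(rmsd_of_top_pick(decoys, k) for decoys in test_set.values())
        for k in keys
    }


# Invented decoy sets: RMSD to the known structure plus both pseudo-energies.
test_set = {
    "protein_a": [
        {"rmsd": 4.0, "crease": -10.0, "eef1": -100.0},
        {"rmsd": 8.0, "crease": -12.0, "eef1": -80.0},
    ],
    "protein_b": [
        {"rmsd": 6.0, "crease": -9.0, "eef1": -70.0},
        {"rmsd": 5.0, "crease": -7.0, "eef1": -90.0},
    ],
}

result = compare(test_set)
```

A lower average means that function's top picks sit closer to the known structures. Splitting the CASP targets between the two functions gives the same kind of comparison on real predictions, once the results are released.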