You are correct, and I agree with most of what you say. However, we have found that the crease energy can perform better than it may initially appear if used properly. Just for reference, the crease energy of the native structure for the protein in the beta test is -4800, which none of the 5 Å RMSD structures have come close to yet. It is a mistake to draw conclusions about a scoring function after testing it on just a single protein.
We have thoroughly tested all of our current scoring functions on a set of 17 proteins with distinct folds to get a better idea of how they perform. Nevertheless, crease energy is far from ideal.
However, as you may or may not know, this project is for the most part a chunk of my PhD thesis, which deals mostly with the sampling problem of protein folding. While I look at scoring as well, since it is inextricably bound to sampling, it is not my major focus. That said, others in the lab are looking at the scoring problem in more detail, and their findings will be incorporated into DFP as our scoring functions are tested and improved.
The beauty of the algorithm is that it is trivial to plug in a new scoring function, once we have one to try, without changing anything else in the algorithm. This will allow for rapid testing of new scoring functions on a large sample set after our preliminary testing on smaller sets of various protein folds.
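
To give a rough idea of what "plugging in" a scoring function means, here is a toy sketch in Python. It is not DFP's actual code: the names (Structure, best_candidate, sum_of_squares, max_abs) are hypothetical, a "structure" is just a list of numbers, and the two scoring functions are arbitrary stand-ins rather than the crease energy. It also assumes that a lower score is better. The only point is that the selection step takes the scoring function as an argument, so swapping it changes nothing else.

```python
from typing import Callable, Sequence

# Hypothetical stand-in for a candidate structure: a flat list of numbers.
Structure = Sequence[float]
# A scoring function maps a structure to a number (assumed: lower is better).
ScoringFunction = Callable[[Structure], float]


def sum_of_squares(structure: Structure) -> float:
    """Toy scoring function A (illustrative only, not a real energy)."""
    return sum(x * x for x in structure)


def max_abs(structure: Structure) -> float:
    """Toy scoring function B (illustrative only, not a real energy)."""
    return max(abs(x) for x in structure)


def best_candidate(candidates: Sequence[Structure],
                   score: ScoringFunction) -> Structure:
    """Return the candidate with the lowest score.

    The scoring function is a parameter, so a new one can be tried
    without touching this selection step.
    """
    return min(candidates, key=score)


candidates = [[3.0, 0.0], [2.0, 2.0], [1.0, -2.5]]

# Same candidates, same selection code, two different scoring functions.
pick_a = best_candidate(candidates, sum_of_squares)
pick_b = best_candidate(candidates, max_abs)
print(pick_a, pick_b)
```

The two toy functions rank the same candidates differently, which is the kind of comparison that becomes cheap to run once the scoring function is a swappable piece.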