An improvement for CASP 5?



SpongeBob SquarePants
04-12-2002, 05:47 PM
NI!

Howard,

:confused: :confused: :confused: :confused:

I may be a totally wet Sponge here, but instead of randomly brute-forcing our way to a low RMS, would it not be better to base new work on the results of previous work?

I may be wrong on this, but it seems that we are "infinite monkey"-ing this. I hope I am wrong.

Maybe you could explain in broad brush strokes how the project generates random seeds, and how they help.

Instead of using random seeds, if my shrubbers were to start from, say, the last-best (lowest-RMS) structure, they would be focusing in on a more accurate result.

I am not describing this very well but here goes:

(Picture the game MasterMind. You do not know the color (AA) or the placement (order), but through process of elimination you can divine the correct answer. In our case we have 4 colors with 62 places.)

When my 5,000 or 10,000 units are uploaded, the program tells me to crunch the next 5,000 but places a fixed result in the first AA. We will call it A (A place1). When Y2K+1Guy gets his units, he is told to place a fixed result in the second AA (A place2). We turn in our results, and whoever has the lower RMS disqualifies the other. That lower-RMS unit is sent to Worker Bee #3, and so on. We try the other colors (AAs) in place one, then move to place 2. We keep going until we have tried all 4 colors in all 62 places.
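The place-by-place scheme described above can be sketched as a greedy search. This is only a toy under loud assumptions: `toy_score` is a hypothetical stand-in for RMS that counts mismatched places against a known target, and the names `COLORS`, `PLACES`, and `greedy_search` are invented for illustration. It is not how the project's client works.

```python
import random

COLORS = "ABCD"   # the 4 "colors" from the MasterMind analogy
PLACES = 62       # the 62 "places"

def toy_score(guess, target):
    """Hypothetical stand-in for RMS: number of places that differ (lower is better)."""
    return sum(g != t for g, t in zip(guess, target))

def greedy_search(target, rng):
    """Fix one place at a time, keeping whichever color gives the lowest score."""
    current = [rng.choice(COLORS) for _ in range(PLACES)]
    evaluations = 0
    for place in range(PLACES):
        best_color, best_score = current[place], None
        for color in COLORS:
            trial = current[:place] + [color] + current[place + 1:]
            score = toy_score(trial, target)
            evaluations += 1
            if best_score is None or score < best_score:
                best_color, best_score = color, score
        current[place] = best_color
    return "".join(current), evaluations

rng = random.Random(0)
target = "".join(rng.choice(COLORS) for _ in range(PLACES))
found, evaluations = greedy_search(target, rng)
print(found == target, evaluations)
```

On this toy the search needs only 4 x 62 = 248 scorings, because each place contributes to the score independently. A real structure's RMS presumably depends on all places together, so fixing them one at a time may settle on a poor answer; the sketch shows the idea, not a guarantee.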

(For all you math folks: is the total number of possible combinations 4^62?)
(If so, then once we get one place right, is it 4^61, etc.?)
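For what it is worth, the arithmetic in the question checks out under the analogy's own assumption of 4 independent choices in each of 62 places. A quick check (the variable names are made up for illustration):

```python
colors, places = 4, 62

# Every place can independently take any of the 4 colors.
total = colors ** places

# Once one place is known for certain, 61 places remain free.
after_one_fixed = colors ** (places - 1)

print(total)                     # roughly 2.1e37 combinations
print(total // after_one_fixed)  # each fixed place shrinks the space by this factor
```

Each place that gets pinned down divides the remaining search space by 4, which is why elimination beats blind guessing in the MasterMind picture.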

This may be what we are doing now. Maybe some of the more scientific/programmer types could explain this to little ol' me.

I am very competitive and really want you guys to win this.

A Curious SpongeBob SquarePants



P.S. When we are doing a blind test for CASP 5, how do we know what the best RMS is?

:confused:

bwkaz
04-13-2002, 09:08 AM
Well I can answer your P.S.

They don't.

What I think I remember Howard saying (or maybe it was on the DF website) is that right now they're just trying to find a correlation between low RMS deviation and low energy (or something like energy). If that correlation is good enough, that is, close enough to 1 (I think correlations from 0 to 1/2 are pretty bad, from 1/2 to about 0.9 are decent, and higher than 0.9 is great; it can't get above 1), then for CASP they can just run the clients for however long it takes and submit the ten (or whatever) lowest-energy proteins.

Howard, can you confirm this? Or is my memory failing me? ;)
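The idea described above (check how well energy tracks RMS deviation, then pick by energy alone when the true structure is unknown) can be sketched roughly as follows. Everything here is an assumption for illustration: the numbers are made up, and `pearson` and `lowest_energy` are hypothetical helpers, not anything from the actual client or server.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def lowest_energy(structures, k):
    """Return the k structures with the lowest energy."""
    return sorted(structures, key=lambda s: s["energy"])[:k]

# Made-up example data: RMSD is only known here because this is a test
# protein with a known answer.
structures = [
    {"id": "s1", "energy": -50.0, "rmsd": 9.0},
    {"id": "s2", "energy": -80.0, "rmsd": 7.5},
    {"id": "s3", "energy": -120.0, "rmsd": 4.1},
    {"id": "s4", "energy": -100.0, "rmsd": 6.0},
    {"id": "s5", "energy": -60.0, "rmsd": 8.2},
]

r = pearson([s["energy"] for s in structures], [s["rmsd"] for s in structures])
picks = [s["id"] for s in lowest_energy(structures, 2)]
print(round(r, 3), picks)
```

If the correlation holds up on proteins where the answer is known, the same ranking by energy could be applied in the blind case, where RMSD cannot be computed at all.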

Brian the Fist
04-13-2002, 11:22 AM
Yep, that's a fairly accurate description. Your crunchers are all computing energies (behind your backs :shocked: ) of the structures being generated and sending those to us as well. Once we post results you'll see the plots of energy vs. RMSD.

Also, just so you know, structures are not chosen completely at random. There is some bias, based on known structures, already built into the method (see http://bioinfo.mshri.on.ca/trades if you are feeling adventurous :crazy: ).

We intend to enhance the algorithm further before CASP as well, perhaps by adding a genetic-algorithm type approach to it, but that all depends on whether we can get it done and working effectively in time.
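For readers unfamiliar with the term, a genetic-algorithm type approach generally means keeping a population of candidates, breeding the better ones, and mutating the offspring. A minimal generic sketch on a toy problem follows; it is not the project's planned method, and `score`, `evolve`, and all parameters are hypothetical. The toy scores candidates against a known target, whereas a blind prediction would have to score by something like energy.

```python
import random

ALPHABET = "ABCD"
LENGTH = 62

def score(candidate, target):
    """Toy fitness: number of mismatched places (lower is better)."""
    return sum(c != t for c, t in zip(candidate, target))

def evolve(target, rng, pop_size=40, generations=60, mutation_rate=0.02):
    """Generic GA loop: selection, one-point crossover, mutation, elitism."""
    population = ["".join(rng.choice(ALPHABET) for _ in range(LENGTH))
                  for _ in range(pop_size)]
    history = []  # best score seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda c: score(c, target))
        history.append(score(population[0], target))
        parents = population[: pop_size // 2]  # selection: keep the better half
        children = [population[0]]             # elitism: best survives unchanged
        while len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, LENGTH)     # one-point crossover
            child = mum[:cut] + dad[cut:]
            child = "".join(rng.choice(ALPHABET) if rng.random() < mutation_rate else c
                            for c in child)    # occasional random mutation
            children.append(child)
        population = children
    best = min(population, key=lambda c: score(c, target))
    return best, history

rng = random.Random(1)
target = "".join(rng.choice(ALPHABET) for _ in range(LENGTH))
best, history = evolve(target, rng)
print(history[0], score(best, target))
```

Because the best candidate is always carried over unchanged, the best score can never get worse from one generation to the next. That is the sense in which new work builds on previous work instead of starting from fresh random seeds each time.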

We are very excited about participating in CASP this year, though, as I'm sure at least some of you are ;)