
Speeding up the PRODUCTIVE crunching



Scott Jensen
04-17-2002, 11:58 AM
Assumption #1: What's sought is the smallest structures possible.

Assumption #2: All other structures are discarded.

If the above is true, how about building into the client program a kill command that restarts it when the structure it is working on exceeds the current smallest known structure for that protein?

Each time the client program uploads to the server, it would check the smallest structure currently known for the protein being worked on. Any structure that starts to go beyond that would then be terminated and discarded, and the program would start over. This could possibly at least double the productivity of the client programs.

wirthi
04-17-2002, 05:59 PM
Hi,

As far as I know we are not searching for the smallest structure but for the "best", the one that could possibly occur in nature (or the one that can be reproduced most easily, I don't know).

Greets,
Christian

Aegion
04-17-2002, 06:07 PM
I'll let Brian get into the full explanation, but your assumptions are wrong here. The "size" of the structure is actually the variation of the simulated protein from the actual shape of the folded protein. Generally the most rapidly processed structures are not the correct shape, and ones that take longer are more likely to be accurate.

Scott Jensen
04-17-2002, 07:35 PM
I thought smallest was the one sought since two of the stats that are always given when I check my stats are:

"Your smallest RMSD structure"

and

"Overall smallest RMSD structure"

As well as the stat chart that's titled: "Best structures generated to date" and those are the smallest structures generated to date.

Why give such emphasis to smallest structures if this isn't what's sought?

Aegion
04-17-2002, 07:54 PM
I believe "smallest structure" means smallest degree of variation from the actual folded protein. I'll wait for Brian to give a detailed explanation of the exact science behind this.

Brian the Fist
04-17-2002, 07:56 PM
Smallest RMSD structure does not mean smallest structure. Much of the science involved is described both on the folding website and on our TRADES page (bioinfo.mshri.on.ca/trades), so please look through there as I'm not going to give a detailed explanation here.

Briefly though, RMSD is a measure of similarity between a 'pseudo-random' structure and a 'true' structure. The proteins we are generating right now already have known structures, so the RMSD indicates how close to the correct structure we are getting (i.e. WE know the correct structure, but the software obviously doesn't use this information to sample conformational space). The smaller the RMSD, the closer to 'correct' the structure is.
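
To make the idea concrete, here is a minimal sketch of an RMSD calculation. It is only an illustration, not the project's actual code: the function name `rmsd` and the toy coordinates are made up for this example, and it assumes the two structures have the same atoms in the same order and have already been superimposed (real comparisons normally do an optimal alignment first).

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) points.

    Assumption: both lists hold the same atoms in the same order and the
    structures are already superimposed; no alignment is done here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of atoms")
    if not coords_a:
        raise ValueError("structures must not be empty")

    # Sum of squared distances between corresponding atoms.
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2

    # Mean over atoms, then square root.
    return math.sqrt(total / len(coords_a))


# Hypothetical toy data: a 'true' structure and a sampled one.
true_structure = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
sampled = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0)]

print(rmsd(true_structure, true_structure))  # identical structures give 0.0
print(rmsd(true_structure, sampled))         # one atom off by 1 unit
```

Note that nothing here measures how big the structure is; only the distances between matching atoms in the two structures enter the result.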

An RMSD of 0.0 means the structures are identical. 2.0 or less is generally close enough to start designing drugs to target that protein. 6.0 means the general topologies will be similar. 10.0 will generally have little recognizable similarity. RMSD also scales with the size of a protein (# of amino acids), so an RMSD of 6.0 for a protein which is 100 AA is more meaningful than an RMSD of 6.0 for a small 30-residue protein, for example.
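
Those rules of thumb could be turned into a rough lookup like the sketch below. The function name `describe_rmsd` is hypothetical, and the cutoffs between the quoted values (everything from 6.0 up to 10.0 in particular) are assumptions made only so the example has clean bands; it also ignores the protein-size scaling mentioned above.

```python
def describe_rmsd(rmsd_value):
    """Map an RMSD value to the rough interpretation given in this thread.

    Assumption: the quoted values (0.0, 2.0, 6.0, 10.0) are treated as band
    edges; the thread does not say where one band ends and the next begins.
    """
    if rmsd_value < 0:
        raise ValueError("RMSD cannot be negative")
    if rmsd_value == 0.0:
        return "identical structures"
    if rmsd_value <= 2.0:
        return "generally close enough to start designing drugs"
    if rmsd_value <= 6.0:
        return "general topology similar"
    if rmsd_value < 10.0:
        return "between 'similar topology' and 'little similarity' (not specified)"
    return "little recognizable similarity"


for value in (0.0, 1.5, 6.0, 8.0, 12.0):
    print(value, "->", describe_rmsd(value))
```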

jkeating
04-17-2002, 11:42 PM
Originally posted by Brian the Fist
RMSD also scales with the size of a protein (# of amino acids) so an RMSD of 6.0 for a protein which is 100 AA is more meaningful than an RMSD of 6.0 for a small 30 residue protein, for example.

So, if we were to pick a protein with say.... ohhhh, just to pick a random number - 76AA :D , how low an RMSD would you be hoping to get among, say, 10,000,000,000 structures?

To be blunt, what are we shooting for with THIS protein?

Scott Jensen
04-18-2002, 03:26 AM
How about adding a simple sentence to the stat page that would explain this to prevent further misunderstandings?