new algorithm... [Archive]

Brian the Roman

01-23-2003, 11:03 PM

Well Howard, you're in trouble now - you got me thinking! :confused:

Divide the folding problem into 2 pieces - figuring out all the possible shapes and determining the energies for each shape.

I tend to think of all of the possible shapes for a single protein as being a very large 2-dimensional (x,y) Cartesian graph. Now, if you add a third dimension (z) and make that the energy value of the protein, then you end up with a large rectangle with hills and valleys. Our problem is that we don't know what the energy values are for any of the points x,y.

It seems to me that one of the real problems is that the best method of finding the lowest of all valleys depends upon the general shape of the hills and valleys. For instance, if every 4th x,y position was a local minimum, then exploring around the valleys wouldn't help. On the other hand, if there was only one minimum in the entire space and the entire space was tilted towards it, then it should be fairly easy for an algorithm to head directly for the lowest energy co-ordinate simply by looking at the lowest energy neighbour.
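To make the two cases above concrete, here is a minimal sketch of the "look at the lowest energy neighbour" idea on a small grid. Both landscapes are made-up toy data (a tilted bowl, and the same bowl with a dip at every 4th position), and `greedy_descent` is just an illustrative name, not anything from the project.

```python
def greedy_descent(energy, start):
    """Step to the lowest-energy neighbour until no neighbour is lower."""
    rows, cols = len(energy), len(energy[0])
    x, y = start
    while True:
        best = (x, y)
        # Look at the four direct neighbours (up, down, left, right).
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < rows and 0 <= ny < cols:
                if energy[nx][ny] < energy[best[0]][best[1]]:
                    best = (nx, ny)
        if best == (x, y):
            return best  # no lower neighbour: a (possibly local) minimum
        x, y = best


# Toy landscape 1: one minimum at (7, 7), everything tilted towards it.
bowl = [[(i - 7) ** 2 + (j - 7) ** 2 for j in range(10)] for i in range(10)]

# Toy landscape 2: the same bowl, but every 4th position is a local dip.
bumpy = [[bowl[i][j] - (30 if i % 4 == 0 and j % 4 == 0 else 0)
          for j in range(10)] for i in range(10)]

print(greedy_descent(bowl, (0, 0)))   # walks all the way to the single minimum
print(greedy_descent(bumpy, (0, 0)))  # gets stuck in the dip it starts in
```

On the bowl the walk reaches the true minimum; on the bumpy version it never leaves its starting dip, which is the situation where neighbour-following doesn't help.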

What this implies is that we need to get a sense of what the 'energy landscape' looks like. It should be fairly easy to come up with an algorithm which can convert any protein shape into an x,y co-ordinate based upon the protein definition. Likewise it would be easy to change it back from an x,y co-ordinate into one and only one shape. This allows all communications regarding the structures to be done as a simple x,y co-ordinate instead of the structure description - much lower bandwidth requirements. Next, given that we have already calculated over 10 billion structures, we could use that data, map it to the x,y co-ordinates and then build a tool to 'look at' the result (could even be as simple as a 3d view of the hills and valleys). This would allow us to learn from the conformational space of one protein which we would sample extensively, and apply it to other proteins and see if their energy landscapes share the same basic characteristics.
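As a rough sketch of what such a two-way conversion might look like: assume (and this is purely my assumption, not the project's format) that a shape is described as a list of angle "bins", each a whole number from 0 to k-1. The list can be packed into one big integer and then split into x and y. The names `shape_to_xy` and `xy_to_shape` and the parameters `k` and `width` are hypothetical.

```python
def shape_to_xy(bins, k, width):
    """Pack a list of angle bins (each in 0..k-1) into an (x, y) pair."""
    n = 0
    for b in bins:
        if not 0 <= b < k:
            raise ValueError("bin out of range")
        n = n * k + b          # treat the bins as digits in base k
    return n % width, n // width


def xy_to_shape(x, y, k, width, length):
    """Inverse of shape_to_xy: recover the list of angle bins."""
    n = y * width + x
    bins = []
    for _ in range(length):
        n, b = divmod(n, k)    # peel off base-k digits, last bin first
        bins.append(b)
    return bins[::-1]


shape = [3, 0, 5, 2, 1]        # made-up example: 5 angles, 6 bins each
x, y = shape_to_xy(shape, k=6, width=100)
print(x, y)
print(xy_to_shape(x, y, k=6, width=100, length=len(shape)))
```

One caveat with this kind of packing: each x,y maps to exactly one shape, but two points that sit next to each other on the graph are not necessarily similar shapes, so the hills and valleys may not come out smooth.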

Another thing you could do using x,y to specify the structures is to have the server provide each client with an area to search and a 'sampling factor' which would specify how thoroughly to search the specified area. This would allow you to ensure that the entire conformational space was being searched evenly, and by 'connecting the dots' we could interpolate the values between the known points. Or you could focus the efforts on certain 'promising' areas.
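A minimal sketch of the area plus 'sampling factor' idea, assuming the server hands out a rectangle and a step size (every name here is made up for illustration). The client visits every `step`-th point in its rectangle, and values between two known points are filled in by straight-line interpolation.

```python
def sample_points(x0, y0, x1, y1, step):
    """Points a client would evaluate: every step-th x,y in the rectangle."""
    return [(x, y)
            for x in range(x0, x1, step)
            for y in range(y0, y1, step)]


def interpolate(energy_a, energy_b, t):
    """Straight-line estimate between two known energies, t from 0 to 1."""
    return energy_a + t * (energy_b - energy_a)


# A coarse pass over a 100 x 100 area, then a finer pass over a promising corner.
coarse = sample_points(0, 0, 100, 100, step=10)
fine = sample_points(40, 40, 50, 50, step=2)
print(len(coarse), len(fine))

# Estimated energy halfway between two sampled neighbours (made-up values).
print(interpolate(-10.0, -20.0, 0.5))
```

A smaller step means a more thorough search of the same area, so the server could lower the step only for the 'promising' regions.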

A third benefit of this approach is that it would be REALLY cool to be able to go to your website and pull up a 3d image of the conformational space of the proteins we have worked on, with interpolation lines between the known structures.

Now I realize that I know very little about all this. You guys have probably thought of most of these types of things already. But it would be cool to be able to see the 'energy landscape' graphically.

mike s

Scoofy12

01-23-2003, 11:29 PM

good thinking... in fact, they have already done a similar kind of mapping. check out http://bioinfo.mshri.on.ca/trades/

it's their way of mapping the conformational space.

Brian the Fist

01-24-2003, 10:25 AM

Unfortunately the energy landscape is far too complex to plot, not to mention that it has several hundred dimensions, not two. It is also far too large to hope to cover exhaustively; even with 10 billion samples we are just scratching the surface.
Thus other, more powerful ways are needed to explore conformational space.
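A back-of-envelope illustration of the scale, using made-up round numbers rather than the project's actual parameters: assume a 100-residue chain with 2 backbone angles per residue (200 dimensions), and allow each angle only 3 coarse settings.

```python
import math

# All three values are assumptions chosen only for illustration.
residues = 100
angles_per_residue = 2
states_per_angle = 3

dimensions = residues * angles_per_residue
log10_states = dimensions * math.log10(states_per_angle)
log10_sampled = 10  # 10 billion samples = 10^10

print(f"dimensions: {dimensions}")
print(f"possible shapes: about 10^{log10_states:.0f}")
print(f"fraction sampled: about 10^{log10_sampled - log10_states:.0f}")
```

Even with this very coarse 3-way discretization, the number of possible shapes has over 90 digits, so 10 billion samples cover a vanishingly small fraction of it.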

If you look in the Results section on the web site, we DO show some energy vs. RMSD plots, which at least give a very crude picture of what the landscape is like.
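For anyone unfamiliar with RMSD (root-mean-square deviation), here is a minimal sketch of the basic formula on two sets of atom coordinates. It assumes the two structures are already superimposed; in practice an optimal alignment step (for example the Kabsch algorithm) is normally applied first, and that step is left out here. The coordinates are made up.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(3.0, 4.0, 0.0), (4.0, 4.0, 0.0)]  # every atom moved by (3, 4, 0)
print(rmsd(reference, reference))
print(rmsd(reference, shifted))
```

Plotting energy against this single number collapses all the dimensions into one distance from a reference structure, which is why the resulting picture is only a crude view of the landscape.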

RaginSteveK

01-25-2003, 12:59 PM

Originally posted by Brian the Fist
Unfortunately the energy landscape is far too complex to plot, not to mention that it has several hundred dimensions, not two. It is also far too large to hope to cover exhaustively; even with 10 billion samples we are just scratching the surface.
Thus other, more powerful ways are needed to explore conformational space.

If you look in the Results section on the web site, we DO show some energy vs. RMSD plots, which at least give a very crude picture of what the landscape is like.

but.. Fist.. we've shown that we can do 10^10 in a veritable heartbeat..

what kind of energy [energies]: formation, rotation, conformation ???

I remember, in my drunken youth, trying to make some sense outa PChem's struggling with polymeric monolayers.. with screwball coefficients - each a journal page long - relating to "properties" or state functions that had no physical significance [???]

use them ..:{ )
[J Chem Physics, early 70's comes to mind as the worst offender]

:bang: