Brian the Roman
01-23-2003, 11:03 PM
Well Howard, you're in trouble now - you got me thinking! :confused:
Divide the folding problem into 2 pieces - figuring out all the possible shapes and determining the energies for each shape.
I tend to think of all of the possible shapes for a single protein as being a very large 2-dimensional (x,y) Cartesian graph. Now, if you add a third dimension (z) and make that the energy value of the protein, then you end up with a large rectangle with hills and valleys. Our problem is that we don't know what the energy values are for any of the points x,y.
It seems to me that one of the real problems is that the best method of finding the lowest of all valleys depends upon the general shape of the hills and valleys. For instance, if every 4th x,y position was a local minimum, then exploring around the valleys wouldn't help. On the other hand, if there was only one minimum in the entire space and the entire space was tilted towards it, then it should be fairly easy for an algorithm to head directly for the lowest energy co-ordinate simply by always moving to the lowest energy neighbour.
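To make the "lowest energy neighbour" idea concrete, here is a rough Python sketch. Everything in it is made up for illustration: the 100 x 100 grid, the two toy energy functions (`tilted` and `bumpy`) and the function name `greedy_descent` are my own assumptions, not anything from the real project, whose energy values we don't actually know.

```python
import math

def greedy_descent(energy, start, width, height):
    """Walk downhill on a grid by always stepping to the lowest-energy neighbour."""
    x, y = start
    while True:
        # The (up to) 8 grid points surrounding the current position.
        neighbours = [
            (x + dx, y + dy)
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)
            and 0 <= x + dx < width
            and 0 <= y + dy < height
        ]
        best = min(neighbours, key=lambda p: energy(*p))
        if energy(*best) >= energy(x, y):
            return (x, y)  # no neighbour is lower: we are in a (local) minimum
        x, y = best

def tilted(x, y):
    # Hypothetical landscape: a single bowl whose bottom is at (30, 70).
    return (x - 30) ** 2 + (y - 70) ** 2

def bumpy(x, y):
    # Same bowl, plus ripples that repeat every 4 positions in x and y.
    return tilted(x, y) + 200 * math.cos(x * math.pi / 2) * math.cos(y * math.pi / 2)

print(greedy_descent(tilted, (0, 0), 100, 100))  # reaches the bottom at (30, 70)
print(greedy_descent(bumpy, (0, 0), 100, 100))   # stops early in a ripple
```

On the tilted landscape the walk goes straight to the bottom; on the bumpy one it gets stuck almost immediately, which is the point: the same search method is great or useless depending on the shape of the landscape.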
What this implies is that we need to get a sense of what the 'energy landscape' looks like. It should be fairly easy to come up with an algorithm which can convert any protein shape into an x,y co-ordinate based upon the protein definition. Likewise it would be easy to change it back from an x,y co-ordinate into one and only one shape. This would allow all communications regarding the structures to be done as a simple x,y co-ordinate instead of the structure description - much lower bandwidth requirements. Next, given that we have already calculated over 10 billion structures, we could use that data, map it to the x,y co-ordinates and then build a tool to 'look at' the result (could even be as simple as a 3d view of the hills and valleys). This would allow us to learn from the conformational space of one protein, which we would sample extensively, and then see whether other proteins' energy landscapes share the same basic characteristics.
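Here is one way the shape-to-x,y conversion could look, as a sketch only. I'm assuming a shape can be written as a list of small integers, one per residue (say, which of `n_states` angle bins each residue is in). That representation, the `width` of the grid and the function names `shape_to_xy` and `xy_to_shape` are all hypothetical; I don't know how the client really stores structures.

```python
def shape_to_xy(states, n_states, width):
    """Pack a list of per-residue states into one integer, then split it into (x, y)."""
    index = 0
    for s in states:
        if not 0 <= s < n_states:
            raise ValueError("state out of range")
        index = index * n_states + s  # treat the states as digits in base n_states
    y, x = divmod(index, width)       # row and column on a grid 'width' points wide
    return x, y

def xy_to_shape(x, y, n_states, width, length):
    """Inverse of shape_to_xy: recover the one and only shape for this (x, y)."""
    index = y * width + x
    states = []
    for _ in range(length):
        index, s = divmod(index, n_states)  # peel off the last digit
        states.append(s)
    return states[::-1]

shape = [2, 0, 3, 1, 1, 0]                    # made-up 6-residue shape, 4 states each
x, y = shape_to_xy(shape, n_states=4, width=64)
print(x, y)                                   # two small integers instead of a structure
print(xy_to_shape(x, y, n_states=4, width=64, length=6) == shape)  # True
```

This toy mapping is one-to-one in both directions, which is all the bandwidth idea needs. It does not by itself guarantee that x,y points which are close together are similar shapes, so how 'smooth' the hills and valleys look would depend on the mapping chosen.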
Another thing you could do using x,y to specify the structures is to have the server provide each client with an area to search and a 'sampling factor' which would specify how thoroughly to search the specified area. This would allow you to ensure that the entire conformational space was being searched evenly, and by 'connecting the dots' we could interpolate the values between the known points. Or you could focus the efforts on certain 'promising' areas.
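A quick sketch of how the server side of that might work, again with invented names (`split_into_areas`, `points_to_sample`) and an invented meaning for the 'sampling factor': here a factor of 5 means "test every 5th point in each direction". The real server could define it quite differently.

```python
def split_into_areas(width, height, tile):
    """Cut a width x height grid into rectangles (x0, y0, x1, y1) of at most tile x tile."""
    return [
        (x0, y0, min(x0 + tile, width), min(y0 + tile, height))
        for y0 in range(0, height, tile)
        for x0 in range(0, width, tile)
    ]

def points_to_sample(area, sampling_factor):
    """Yield every sampling_factor-th point of the area in each direction."""
    x0, y0, x1, y1 = area
    for y in range(y0, y1, sampling_factor):
        for x in range(x0, x1, sampling_factor):
            yield (x, y)

areas = split_into_areas(100, 100, tile=50)        # 4 areas, one per client
coarse = list(points_to_sample(areas[0], 5))       # even, sparse coverage
fine = list(points_to_sample(areas[0], 1))         # a 'promising' area searched fully
print(len(areas), len(coarse), len(fine))          # 4 100 2500
```

Handing out a small factor for some areas and a large one for others gives both modes from the post: even coverage of the whole space, or extra effort on the promising parts.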
A third benefit of this approach is that it would be REALLY cool to be able to go to your website and pull up a 3d image of the conformational space of the proteins we have worked on, with interpolation lines between the known structures.
Now I realize that I know very little about all this. You guys have probably thought of most of these types of things already. But it would be cool to be able to see the 'energy landscape' graphically.
mike s