New protein

**Marcelloz** · 01-10-2004, 09:11 AM

I just can't wait, as I'm too curious: what's the next protein going to be like? How will it compare to what we've got now? Is it bigger or smaller?

**iggy** · 01-10-2004, 01:14 PM

I hope it will be at least three times the size of the one we are crunching now!

**Fozzie** · 01-10-2004, 02:45 PM

and they halve the points too eh Iggy

**ColdFusion** · 01-10-2004, 05:02 PM

Chickens

**iggy** · 01-10-2004, 06:07 PM

Originally posted by Fozzie
and they halve the points too eh Iggy

What a good idea!

Originally posted by ColdFusion
Chickens

Not really. Just trying to implement a bit of (wishful) practical thinking!

Well, whatever happens, I'm glad that there is one more team that can pull it off as such, together.

If DPC continues folding like this, we'll have to change proteins every two weeks... I don't think Howard will be extremely amused with it

**Grumpy** · 01-10-2004, 06:10 PM

All those chickens from the teams in positions 3, 4 & 5, follow me to the pub for a toast to the massive protein we hope to get

**Hagar** · 01-10-2004, 08:39 PM

Originally posted by iggy
What a good idea!

Not really. Just trying to implement a bit of (wishful) practical thinking!

Well, whatever happens, I'm glad that there is one more team that can pull it off as such, together.

If DPC continues folding like this, we'll have to change proteins every two weeks... I don't think Howard will be extremely amused with it

He can always double the 'goal amount of work'

No reason to panic, Howard, just give us that small protein

**iggy** · 01-11-2004, 11:17 AM

Whatever the protein is going to be, one thing is obvious: we are not getting any good results from the new algorithm, no matter how many structures are being processed...

**Paratima** · 01-11-2004, 11:26 AM

Originally posted by iggy
Whatever the protein is going to be, one thing is obvious: we are not getting any good results from the new algorithm, no matter how many structures are being processed...

Ahhh. You'd be a protein scientist, then?

**Anteraan** · 01-11-2004, 01:47 PM

Well, I'm no protein scientist either, but after over 6.3 billion structures, it's true that we're still stuck at 8.36 A (which kind of looks like a freak occurrence when you look at the rest of the top 10). Also, this is a fairly large protein (in terms of this project), and it seems reasonable that a smaller protein would provide a better opportunity for a low RMS. Like I said, I'm no protein scientist, so if that statement is complete bunk, someone come in and correct me.

Still, based on the two sets of results on this same protein, it is clear that this newer algorithm is an improvement over the previous algorithm, both in terms of RMS (modest) and speed of processing (huge). So, in that sense, progress is being made, and I believe it would be premature to pass judgment on the new algorithm until it gets a few more runs on different proteins. Howard also indicated that he has a number of other ideas for improving, so keep that in mind.

On occasion, I'm forced to take comfort in the words from one of my graduate instructors, who once told me, "Sometimes, good science seems to move slowly."

**AMD_is_logical** · 01-11-2004, 02:41 PM

Originally posted by Anteraan
Well, I'm no protein scientist either, but after over 6.3 billion structures, it's true that we're still stuck at 8.36 A (which kind of looks like a freak occurrence when you look at the rest of the top 10). Also, this is a fairly large protein (in terms of this project), and it seems reasonable that a smaller protein would provide a better opportunity for a low RMS. Like I said, I'm no protein scientist, so if that statement is complete bunk, someone come in and correct me.

Actually, you are correct. An 8.36A RMSD after so many structures is horrible. I would expect better if you just wadded the protein up at random that many times. (It's less than 6 billion, though, as there are fewer structures than points now.)

Nature evolved proteins to meet two criteria. First, a protein must be useful after it has been properly folded. Second, it must fold to its useful form.

The second criterion is very important. A protein is useless (or worse) if it isn't folded correctly. Mutations that help a protein follow a folding path to the correct final form are beneficial, so proteins will evolve to fold correctly under natural conditions.

There may be a huge number of folds that have low energy, and otherwise look "good". But the fold that counts is the one that the protein evolved to reach naturally. Thus, the only good way to reach the natural fold is to mimic the natural energy function and the way the protein reacts to it. The current algorithm doesn't even come close to that.

I think a good approach would be to start with a small protein whose folding pathway is well known. Then adjust the energy function and the response to it until the program can reliably fold that protein in a natural manner (but without getting stuck in local minima). It should reach near-zero RMSD most of the time, just like the protein does in nature. The program could then be fine tuned to fold more complicated proteins.

**tpdooley** · 01-11-2004, 04:29 PM

Howard recently posted the value the client was able to reach with this protein when we were comparing RMSD of each structure locally. (It was 6.+ A, if I remember correctly.) The goal of this project is to be able to generate a usable structure without having a native structure to compare it to. The low-energy grading has yet to match the local RMSD grading at producing a low-RMSD result.

When we were dealing with the beta for the Phase II client, we tested 6 or 8 different algorithms. So far, with the low-energy approach, we've gone through 2, and hopefully given Howard & Dr. Hogue enough raw data to allow the client to be tweaked a few more times and get closer to, or even better than, the local RMSD grading approach.
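
For anyone wondering what the RMSD figures in this thread measure, here is a minimal sketch in Python. It assumes the two structures are already superimposed and given as equal-length lists of (x, y, z) coordinates in Ångströms; a real comparison would first find the best rotation and translation, which this sketch skips. The function name and the toy coordinates are made up for illustration and are not taken from the client.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-aligned coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between corresponding atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical three-atom toy example (not real protein data)
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 2.0)]
print(round(rmsd(native, model), 3))  # 1.291
```

A structure identical to the native one scores 0, which is why lower is better.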

**shortfinal** · 01-12-2004, 10:57 AM

Originally posted by Anteraan
Well, I'm no protein scientist either, but after over 6.3 billion structures, [snip]

I just want to point out, unless I'm mistaken, that it's 6.3 billion points and not structures. I don't remember what formula Howard is using to calculate points, but there have been fewer than 6.3 billion structures generated. Anyone care to try to calculate it? I'm too lazy.

Shortfinal

**Anteraan** · 01-12-2004, 12:33 PM

Yes, that is correct, shortfinal. As you (and AMD_is_logical before you, I might add) pointed out, I made an error in mentioning structures, not points. Both of you are correct, and I was not.

For reference, the scoring formula is points = INT[100*sqr(gen #)]. From that, even I can see that there are far fewer structures than there are points.

To calculate them, I'll go for a rough estimate:

1 set of 250 gens = 264,270 points = 25,000 structures, which comes out to roughly 10.5708 points per structure

Currently, the Blueprint site is showing roughly 6,682,085,531 points, so divided by the 10.5708 points/struc, that places us in the neighborhood of 632,126,758 structures. This is assuming full sets of 250 for everything turned in, which is most certainly not the case, so that means, in reality, there will be more structures than that completed. How many more, I can't say. But yes, it's less than 6 billion, probably closer to 1 billion.
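
A rough sketch of the same estimate in Python, for anyone who wants to rerun it with a newer points total. It assumes `sqr` means square root (the only reading that gives roughly 264,000 points per 250-generation set), that generations are numbered 1 to 250, and that each holds 100 structures. The function names are made up for illustration. Because of those assumptions and the integer rounding, the per-set total may differ slightly from the 264,270 quoted above.

```python
import math

STRUCTURES_PER_GEN = 100  # assumption: 100 structures per generation
GENS_PER_SET = 250        # a full set, as in the estimate above

def points_for_generation(gen):
    # INT[100 * sqrt(gen)], computed exactly with integer arithmetic
    return math.isqrt(10000 * gen)

def points_per_structure(gens=GENS_PER_SET, per_gen=STRUCTURES_PER_GEN):
    # Average points earned per structure over one full set
    total_points = sum(points_for_generation(g) for g in range(1, gens + 1))
    return total_points / (gens * per_gen)

def estimate_structures(total_points):
    # Lower bound: assumes every submitted set ran the full 250 generations
    return total_points / points_per_structure()

print(round(points_per_structure(), 2))
print(round(estimate_structures(6_682_085_531)))
```

Sets that stop short of 250 generations earn fewer points per structure, so the real structure count is higher than what this prints.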

**iggy** · 01-13-2004, 02:48 PM

From DF's site news page:

01/13/2004
About the new protein
The new protein is 105 residues long, and it is one we have worked on in Phase I. It is being used in order to compare results for the new algorithm. For the time being, we are keeping generation size at 100 structures.

**FoBoT** · 01-13-2004, 03:24 PM

I have been running it for 2 hours on a Pentium III 1.06 GHz laptop and have 8600 structures of gen. 0

**shortfinal** · 01-13-2004, 03:34 PM

Running Tru64 UNIX for a little over 1 hour and got this in progress.txt:

Building structure 96 generation 2
4 until next generation
3 generations buffered
Best Energy so far: 1.278

Did we get values this low on this protein last time?

Shortfinal

**Paratima** · 01-13-2004, 04:37 PM

Dunno. I've got a Win2K box where it's been showing 9996/10000 units completed, for the last 20 minutes!

This is worrying.

**iggy** · 01-13-2004, 04:53 PM

Originally posted by Paratima
Dunno. I've got a Win2K box where it's been showing 9996/10000 units completed, for the last 20 minutes!

This is worrying.

Trying to upload buffered generations?

I had to update all systems manually - autoupdate didn't work, and my daemon couldn't get the latest patch...

Btw, NF2 + AXP @ 2200 MHz finishes 10,000 structures in generation 0 in about 37 minutes.

**Paratima** · 01-13-2004, 05:05 PM

No, this was a clean install with a new distribfold-current-win9x.zip.

Think I'll plug it in on a Linux box (well, not the same file, obviously) & see what happens....

**gistech1978** · 01-13-2004, 05:23 PM

low energy already!
6.290 with 500 of the first 10,000 left
not too shabby?

**iggy** · 01-13-2004, 05:45 PM

Yes, low energies, but RMS is still calculated at server side. I've got -0.010 on one system, but the best RMS is 13.78...

**Grumpy** · 01-13-2004, 06:29 PM

Looking at the top 10 RMS list is like paying a visit to Old MacDonald's Farm... so many cows.

**Gyske** · 01-13-2004, 07:18 PM

Moooooh!

This will be changing though, lots of people over here are upgrading and installing, as our Stampede has just begun.
Lots of other users are probably not in as much of a hurry as we are.

**Ned** · 01-29-2004, 10:47 PM

Ok, if my calculations are correct, we'll reach our target on Feb. 5th...

So, does that mean that we'll get a new protein on Feb. 5th, or do we wait until the following Tuesday?

Inquiring minds would like to plan !!!

Ned


LMAO

10,000,000,000 on Feb.5
