Next test - Beta 5

**Brian the Fist** · 03-25-2003, 02:12 PM

I've updated the beta files again; they are in the same place. Aside from fixing a few bugs you've still managed to find, I've modified the algorithms as per some of our discussions from beta 4. (Let me know if the CPU is still being taken up during energy minimization like it was for some people before.) It should now complete 250 generations in under a week on most machines, but we'll have to try it to find out. We may make it give up even faster when it gets stuck, depending on how this goes. Also, AMD_is_logical's trick will no longer work.

Lastly, we've changed it a bit in an attempt to keep helices from unfolding, which we experienced with beta 4. The fix applied may or may not work, so if this fails we have a backup plan (for test #6) which will almost certainly do the job. This is what was preventing us from getting too close to the correct structure (below 5A).

We intend to run beta 5 for about 1 week and then try the other method for keeping helices after that for another week or so. We are nearing the final release of the new approach however, just a few more tests to tweak the algorithm before we unleash it. Thanks again to all the beta testers for your input and help.

I am going to wipe the beta stats shortly, so please do not start running the new beta until you see the stats have been reset on the beta web site (or else you'll have to start over again anyways).

**m0ti** · 03-25-2003, 03:40 PM

Is the link up?

I don't seem to be having any success downloading it!

**Welnic** · 03-25-2003, 03:43 PM

Originally posted by m0ti
Is the link up?

I don't seem to be having any success downloading it!

I had no trouble grabbing the linux client.

**m0ti** · 03-25-2003, 03:48 PM

I've always had problems downloading from df.org for some reason.

Any chance of someone hosting the windows CLI at another site?

**Digital Parasite** · 03-25-2003, 03:53 PM

For those of you who don't want to dig:

Linux:
ftp://ftp.mshri.on.ca/pub/distribfol...ux-i386.tar.gz

Windows text client:
ftp://ftp.mshri.on.ca/pub/distribfol...beta-win9x.zip

Windows screensaver:
ftp://ftp.mshri.on.ca/pub/distribfol...beta-win9x.zip

**m0ti** · 03-25-2003, 04:03 PM

Just figured out what the problem was; for some reason my server stopped servicing ftp requests properly. Did a restart and it was all ok.

Hmm, maybe in the future they could also be accessible by http? Not that I expect special treatment or anything.

**arjanscholl** · 03-25-2003, 04:18 PM

Hello there, just a question for the maker of the DF gui, don't know if anyone else knows the answer. There are 3 nice bars in the latest beta version of the gui, with the words 'Structure laxness levels' above it. But what the hell do these bars represent?

**Digital Parasite** · 03-25-2003, 04:38 PM

Originally posted by arjanscholl
Hello there, just a question for the maker of the DF gui, don't know if anyone else knows the answer. There are 3 nice bars in the latest beta version of the gui, with the words 'Structure laxness levels' above it. But what the hell do these bars represent?

Surprisingly enough, they represent the laxness levels of the structure being built.

In the current beta, structures can get "stuck" more easily than in the previous algorithm of the client. If this happens too many times, the software is being too strict on how the protein is folded and the 3 parameters that control this are relaxed. The higher the bar graphs are in the GUI, the more relaxed the settings are for folding the structure. Howard, feel free to correct me if what I explained is not correct.

Jeff.

**bwkaz** · 03-25-2003, 06:29 PM

I don't know how much this will screw up dfGUI for Windows (Jeff, care to comment? if it screws your code up then it's probably not a good idea, since so many more people use the Windows version), but the Linux version could really use a feature with regard to the progress updates.

Just before starting energy minimization (which appears to be running at normal priority for me, on Linux at least), could you do one final write of the progress.txt (and filelist.txt might be a good idea too) file, using the same format as now, just something like:

building structure 50 generation 5
0 until next generation
x generations buffered
Best Energy so far: x.xxx

or similar?

The benchmark info is about all that would need this. Right now, seeing as the progress.txt file resets the structure count at each new generation (as opposed to before, where the only time it reset was when you stopped the client), my old benchmark code was very confused. The fix was ugly, though, and requires that I try to figure out when a generation is over. If I miss, the bench data goes negative.

Which wouldn't be a problem either, if the client kept on schedule with its progress.txt updates. I'm using the default (perhaps that's part of the problem?) -g value, and on half the generations, partway through, the client writes progress.txt after doing 3 or 4 structs instead of 5. Perhaps that's a bug, but I'd rather have dfGUI work regardless of what the user's -g setting was. A final write of progress.txt would accomplish that.

Obviously that'd be something for the next beta (or whatever).

**AMD_is_logical** · 03-25-2003, 10:50 PM

I checked this forum tonight and saw that beta5 was out. I then discovered that my nodes had been crunching with beta4a for several hours after the stats reset, and that work from two of the nodes was being accepted by the server. That's why I already have a gen 75 6.22A and a gen 40 6.68A structure on the stats page.

Sorry about that.

I've now removed all old work from my nodes and installed beta5.

**Digital Parasite** · 03-26-2003, 06:59 AM

I have been running beta5 over night and it sure is much faster. I am already at generation 41 on one CPU and 36 on the other. My average generation time is 20 minutes where it used to be over an hour for beta4.

bwkaz: Updating the progress.txt file/filelist.txt won't screw anything up in dfGUI. What I am doing is storing the current generation # (I also need that for timing the previous and average generation times). When I read the progress.txt file, if the generation I read is different from the one I have stored in memory, I know it is a new generation and I can restart the benchmark.
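
A minimal sketch of that check in Python (not dfGUI's actual code; the `Benchmark` class and its method names are made up for illustration, and the line format is assumed from the progress.txt samples posted in this thread):

```python
import re

# Line format assumed from the progress.txt samples quoted in this thread.
HEADER = re.compile(r"building structure (\d+) generation (\d+)", re.IGNORECASE)


def parse_progress(text):
    """Return (structure, generation) from progress.txt contents, or None."""
    match = HEADER.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


class Benchmark:
    """Hypothetical tracker that restarts when the generation number changes."""

    def __init__(self):
        self.generation = None
        self.start_structure = 0

    def update(self, text):
        """Feed progress.txt contents; return structures done since restart."""
        parsed = parse_progress(text)
        if parsed is None:
            return None
        structure, generation = parsed
        if generation != self.generation:
            # New generation: the structure counter has reset, so restart here.
            self.generation = generation
            self.start_structure = structure
        return structure - self.start_structure
```

Because the baseline is reset whenever the generation changes, the count cannot go negative at a generation boundary, even if the last progress.txt write of the previous generation was missed.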

Jeff.

**bwkaz** · 03-26-2003, 09:43 AM

Originally posted by Digital Parasite
I have been running beta5 over night and it sure is much faster. I am already at generation 41 on one CPU and 36 on the other. My average generation time is 20 minutes where it used to be over an hour for beta4.

Yes, decidedly faster. I'm at gen 36 now, where after a day of crunching beta 4, I was at like gen 3 on the same machine. Although it was sharing the CPU with the non-beta client at the time (and isn't now), so that probably has a bit to do with it.

bwkaz: Updating the progress.txt file/filelist.txt won't screw anything up in dfGUI. What I am doing is storing the current generation # (I also need that for timing the previous and average generation times). When I read the progress.txt file, if the generation I read is different from the one I have stored in memory, I know it is a new generation and I can restart the benchmark.

OK, cool. Now that I think about it, I'm going to have to do the same thing anyway, so that I can get the generation time as well (haven't had time to work on getting the latest Windows features ported over -- the vertical progress bars are taking all of it, since Qt has no such thing as a "vertical progress bar").

So there is a fallback if for whatever reason it's decided that another write of progress/filelist is a bad idea. Good.

**Brian the Fist** · 03-26-2003, 12:11 PM

You can now make your own plots, like those computed for the top 10 on the beta. I have posted the software package 'AnalyzeMovie'. Go to http://bioinfo.mshri.on.ca/trades/ and scroll down near the bottom to get it. Read the enclosed readme for details on how it works and how to use it. You can log in and download your 'best movie' and then run this program on it to generate the graphs and so on for it. Be warned that for a big movie (250 generations) it may use a fair bit of RAM and may take five minutes or more if you've got a slow (say 500 MHz) computer!

Have fun!

**Digital Parasite** · 03-26-2003, 12:47 PM

Hi Howard,

I tried the Analyze Movie (Win2k) program but I couldn't get it to work.

I downloaded my best structure movie and the native structure and use this as my command line:
analyzeMovie -f besstruct.val -g native.val

But it doesn't load and I get this error:
[NULL_Caption] FATAL ERROR: [067:001] FindPath failed in LoadDict
Hit Return

Then a dialog box opens and says "Abrupt: code = 1".

Any idea what I am doing wrong? The other options appear to be optional and have default settings.

Jeff.

**shortfinal** · 03-26-2003, 01:02 PM

Howard,

I downloaded the Windows Beta5 text client and tried it while watching the thread priorities with TaskInfo. The priorities are now correct during minimizing/Trajectory Dist. I also tried changing the priority (-p option) and the thread priorities changed accordingly.

Shortfinal

**Guff®** · 03-26-2003, 07:21 PM

This one is looking really good.
I have no problems with it on any systems so far, fingers crossed.
The "Stuck-O-Meter" is a nice addition, allowing users to see that it's trying to work itself out of a jam.
As we say while standing around the grill for the steaks to cook, "Mine's done!"

**Brian the Roman** · 03-27-2003, 06:41 AM

Howard;
I understand that once we go live we will likely use crease energy to choose the best conformation of each gen. I was thinking it would be interesting to do this but still calculate the rmsd (when possible) and graph it. That way we'd get a better understanding of the relationship between them.

ms

**Digital Parasite** · 03-27-2003, 07:22 AM

It is interesting to see how different two clients act because of the random sampling at the beginning and how the folds proceed. With beta5, I started two clients from scratch on my Dual MP-2600+ machine, both have been running for the exact same amount of time and now after 1.5 days one is on generation 122 and has a best RMSD of 7.118 and the other is only at generation 92 but has a best RMSD of 6.389.

Jeff.

**Brian the Roman** · 03-27-2003, 07:38 AM

Howard;
question on how the list of the top 10 best structures works. It used to be only one entry per userid would ever show up, and then, I understand, you changed it to show all the best no matter who did them. I can see that that is working somewhat since I can see Guff has multiple entries in the top 10. However, I currently have one entry in the top 10 at 5.96. An hour ago I was also in the top 10 but it was showing my earlier 6.02 fold. Why isn't that fold showing up too, when the worst structure in the top 10 now is 6.11?

ms

**Digital Parasite** · 03-27-2003, 07:57 AM

Don't forget everyone, if you want a version of dfGUI that works with the current DF beta client, you can download it from here:
http://gilchrist.ca/jeff/dfGUI/dfGUIv22beta.zip

It is still v2.2beta2 so if you already have that version, there is nothing new, I'm just reposting the link in this beta5 thread.

Jeff.

**pointwood** · 03-27-2003, 08:24 AM

I can't download either

EDIT: now it works.

**Brian the Fist** · 03-27-2003, 10:31 AM

Originally posted by Digital Parasite
Hi Howard,

I tried the Analyze Movie (Win2k) program but I couldn't get it to work.

I downloaded my best structure movie and the native structure and use this as my command line:
analyzeMovie -f besstruct.val -g native.val

But it doesn't load and I get this error:
[NULL_Caption] FATAL ERROR: [067:001] FindPath failed in LoadDict
Hit Return

Then a dialog box opens and says "Abrupt: code = 1".

Any idea what I am doing wrong? The other options appear to be optional and have default settings.

Jeff.

Do you have 'bstdt.val' in the same directory as the executable, and are you running it from a DOS prompt in the directory you unzipped it to?

**Digital Parasite** · 03-27-2003, 10:35 AM

Originally posted by Brian the Fist
Do you have 'bstdt.val' in the same directory as the executable, and are you running it from a DOS prompt in the directory you unzipped it to?

Yes, and yes. All the files (including the beststruct.val and native.val) are in the same directory and I am running it from a DOS prompt in that directory.

Jeff.

**Brian the Fist** · 03-27-2003, 10:39 AM

Originally posted by Brian the Roman
Howard;
question on how the list of the top 10 best structures works. It used to be only one entry per userid would ever show up, and then, I understand, you changed it to show all the best no matter who did them. I can see that that is working somewhat since I can see Guff has multiple entries in the top 10. However, I currently have one entry in the top 10 at 5.96. An hour ago I was also in the top 10 but it was showing my earlier 6.02 fold. Why isn't that fold showing up too, when the worst structure in the top 10 now is 6.11?

ms

The stats look right to me. Remember they are only updated once per hour now while when you log-in it is real-time. I see you on the top ten now with a 5.86...

**Brian the Fist** · 03-27-2003, 11:27 AM

Originally posted by Digital Parasite
Yes, and yes. All the files (including the beststruct.val and native.val) are in the same directory and I am running it from a DOS prompt in that directory.

Jeff.

I see now it is a bug in the NCBI toolkit. Anyways, the easiest way to fix it is to run it with '.\analyzeMovie etc. etc.' - put the .\ in front of the command. I'll add this to the documentation until the bug is more properly fixed.

**Digital Parasite** · 03-27-2003, 11:58 AM

Originally posted by Brian the Fist
I see now it is a bug in the NCBI toolkit. Anyways, the easiest way to fix it is to run it with '.\analyzeMovie etc. etc.' - put the .\ in front of the command. I'll add this to the documentation until the bug is more properly fixed.

Everything seems to be working fine when I use '.\' in front. Thanks.

Jeff.

**Brian the Roman** · 03-27-2003, 03:18 PM

Originally posted by Brian the Fist
The stats look right to me. Remember they are only updated once per hour now while when you log-in it is real-time. I see you on the top ten now with a 5.86...

The point that I was trying to make is that one of my top 10 entries disappeared after a better one was added even though the original entry was still in the top 10 overall. I didn't think it should disappear.

As it happens, however, I now have 2 in the top ten but they're from different clients. It looks to me like only one entry per client is logged in the top 10 even if a single client generated many that should be in the list. I didn't think that you could tell the clients apart...

ms

**AMD_is_logical** · 03-27-2003, 04:54 PM

Originally posted by Brian the Roman
As it happens, however, I now have 2 in the top ten but they're from different clients. It looks to me like only one entry per client is logged in the top 10 even if a single client generated many that should be in the list. I didn't think that you could tell the clients apart...

A particular set of generations gets only one entry. Your client would need to finish all 250 generations and start a new set before it could get another entry.

**pointwood** · 03-28-2003, 07:37 AM

I haven't followed the beta tests that closely, so I apologize if this has been discussed before. From progress.txt - is this normal:

Building structure 1 generation 10
49 until next generation
0 generations buffered
Best Energy so far: 10000000.000

It seems like nothing new is happening.

**Brian the Roman** · 03-28-2003, 08:19 AM

Looks to me like you're simply currently working on the first structure of the generation. Wait a bit and it will probably have moved on.

ms

**Digital Parasite** · 03-28-2003, 09:09 AM

Originally posted by pointwood
It seems like nothing new is happening.

I find that the start of each new generation seems to take a lot of time before it gets to the second structure, but once it gets to the second, the rest of the structures in that generation go fairly quickly.

Jeff.

**AMD_is_logical** · 03-28-2003, 02:00 PM

Although others are crunching faster, I find that I am crunching a LOT slower than when I was able to use my clock trick. Also, back then, I had half a dozen structures below 5.2A from just 4 nodes. I'm nowhere near that now.

The 5A wall looks as solid as ever.

Secondary structure still seems to evaporate. Perhaps the use of RMSD for scoring is actually hostile to such structure. I think we will need to test it using energy as a scoring function before we can evaluate how well secondary structure is being encouraged by the beta5 changes.

**tpdooley** · 03-29-2003, 08:11 AM

In a little over a week with the last beta, our best score was 4.95A.. and we're already down to 5.10A.
If 4 machines running at 30-60x the speed of the opposition couldn't get a better score than the 40 other participants in the beta testing.. I'd think it wasn't the route to take...

**Georgina** · 03-29-2003, 06:04 PM

Originally posted by tpdooley
In a little over a week with the last beta, our best score was 4.95A.. and we're already down to 5.10A.

Make that 4.86

G

**arjanscholl** · 03-30-2003, 05:31 AM

Originally posted by Georgina
Make that 4.86

G

or a 4.62

**bwkaz** · 03-30-2003, 08:20 AM

I see a 4.51, myself...

So much for that wall at 5, I guess.

**Brian the Roman** · 03-30-2003, 10:04 AM

I hope I'm not beating a dead horse here, but...

I'm finding that some generations or conformations take MUCH longer than others. I haven't watched long enough to determine if it is every conformation in a specific gen or if it's only a couple.

When I noticed that I didn't seem to be making progress for several hours I went and looked at the client. I saw the "in a tight spot" message and it went up to just over 200K conformations. Then it went back to normal work but immediately did the tight spot routine again. This happened about 6 or 7 times while I was watching (about 15 or 20 minutes). During this entire period the residue # it was trying to place stayed in the 69 to 72 range, so it's not like it was making its way through. To spend 1/2 an hour on a single conformation seems a bit excessive, particularly since the end result was significantly worse than my best so far anyway...

Does the effort limiting process still need a bit of tweaking?

ms

**Mikus** · 03-31-2003, 09:24 AM

Originally posted by Brian the Roman
Does the effort limiting process still need a bit of tweaking?

Looked at the display last evening, then again this morning -- the client seemed to not have gotten much further.

So looked at the timestamps of the accumulated .bz2 files, and saw that there was one generation that had taken 7 hours 51 minutes !! (This is on an AMD machine running around 900+ MHz)
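
For anyone wanting to repeat that check, here is a rough Python sketch. It assumes the client leaves one .bz2 file per finished generation in its working directory, which is a guess based on the observation above rather than documented behavior; the function names are hypothetical.

```python
from pathlib import Path


def generation_durations(directory, pattern="*.bz2"):
    """Return gaps in seconds between consecutive files, ordered by mtime."""
    mtimes = sorted(p.stat().st_mtime for p in Path(directory).glob(pattern))
    return [later - earlier for earlier, later in zip(mtimes, mtimes[1:])]


def format_duration(seconds):
    """Format a number of seconds as hours and minutes, e.g. '7h 51m'."""
    total_minutes = int(seconds // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes:02d}m"
```

Calling `max(generation_durations("."))` from the client directory and passing the result to `format_duration` would then show the slowest gap between two files.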

mikus

**Brian the Roman** · 03-31-2003, 09:51 AM

One of my clients, an Athlon XP1900 is averaging about 4 hours per generation. By my calculations that indicates 250 gens will take over 41 days.
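
The arithmetic, as a quick Python check (the function name is just for illustration):

```python
def days_for_run(hours_per_generation, generations=250):
    """Estimate the wall-clock days needed for a full run of generations."""
    return hours_per_generation * generations / 24


# 4 hours per generation over 250 generations is 1000 hours.
print(f"{days_for_run(4):.1f} days")  # prints "41.7 days"
```

For comparison, the 20-minute average reported earlier in the thread works out to `days_for_run(20 / 60)`, or roughly 3.5 days.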

ms

**Digital Parasite** · 03-31-2003, 11:05 AM

The speed of the client seems to vary widely. I started two beta5 clients at the same time, each on an AMD MP-2600+ processor. They both were active for the same amount of time but one of them finished all 250 generations on Saturday and the other one is still on generation 240 and is running much slower.

The one that is already finished generated a respectable 5.24 RMSD, but the one that is taking much longer to run is still in the 6.x range.

Jeff.
