my dual Athlon MP is locking up



muttley
10-17-2002, 12:01 PM
I run 2 instances manually, and it appears that the second instance is causing a lockup. By lockup I mean the computer, running XP Pro, freezes completely: no mouse, no keyboard, etc.

Instance 1 will run for, say, 30 minutes with no problem; then I start instance 2, and within 5 to 10 minutes the machine locks up. I have 2 MP 1800s running on a Tyan Tiger with 1 GB of registered memory.

What do I need to fix things up???

Here is the second instance's error log (partial; everything before this was running fine). Towards the bottom you can see a "file is the wrong size" error.



========================[ Sep 29, 2002 2:20 PM ]========================

========================[ Oct 4, 2002 6:09 AM ]========================
!!! Strand column overwritten !!!

========================[ Oct 4, 2002 6:23 PM ]========================
ERROR: [010.003] {taskapi.c, line 1199} [ReadServerResponse] Timeout waiting for response, got 0 chars.
ERROR: [000.000] {foldtrajlite.c, line 4734} Error during upload: NO RESPONSE FROM SERVER - WILL TRY AGAIN LATER
ERROR: [777.000] {ncbi_socket.c, line 989} [SOCK::s_Connect] Failed pending connect to www.distributedfolding.org:80 (Unknown)
ERROR: [777.000] {ncbi_connutil.c, line 640} [URL_Connect] Socket connect to www.distributedfolding.org:80 failed: Unknown
ERROR: [777.000] {ncbi_socket.c, line 921} [SOCK::s_Connect] Failed SOCK_gethostbyname(www.distributedfolding.org)
ERROR: [777.000] {ncbi_connutil.c, line 640} [URL_Connect] Socket connect to www.distributedfolding.org:80 failed: Unknown
ERROR: [777.000] {ncbi_socket.c, line 921} [SOCK::s_Connect] Failed SOCK_gethostbyname(www.distributedfolding.org)
ERROR: [777.000] {ncbi_connutil.c, line 640} [URL_Connect] Socket connect to www.distributedfolding.org:80 failed: Unknown
ERROR: [000.000] {foldtrajlite.c, line 4734} Error during upload: NO RESPONSE FROM SERVER - WILL TRY AGAIN LATER
ERROR: [777.000] {ncbi_socket.c, line 921} [SOCK::s_Connect] Failed SOCK_gethostbyname(www.distributedfolding.org)
ERROR: [777.000] {ncbi_connutil.c, line 640} [URL_Connect] Socket connect to www.distributedfolding.org:80 failed: Unknown
ERROR: [777.000] {ncbi_socket.c, line 921} [SOCK::s_Connect] Failed SOCK_gethostbyname(bioinfo.mshri.on.ca)
ERROR: [777.000] {ncbi_connutil.c, line 640} [URL_Connect] Socket connect to bioinfo.mshri.on.ca:80 failed: Unknown
ERROR: [000.000] {foldtrajlite.c, line 1157} Unable to check server status

========================[ Oct 5, 2002 7:20 AM ]========================

========================[ Oct 15, 2002 11:33 PM ]========================
ERROR: [000.000] {foldtrajlite.c, line 4734} Error during upload: NO RESPONSE FROM SERVER - WILL TRY AGAIN LATER
ERROR: [000.000] {foldtrajlite.c, line 4734} Error during upload: NO RESPONSE FROM SERVER - WILL TRY AGAIN LATER
ERROR: [000.000] {foldtrajlite.c, line 4734} Error during upload: NO RESPONSE FROM SERVER - WILL TRY AGAIN LATER
ERROR: [000.000] {foldtrajlite.c, line 4734} Error during upload: STATUS 905 USER HANDLE NOT FOUND
ERROR: [010.003] {taskapi.c, line 1199} [ReadServerResponse] Timeout waiting for response, got 1227860 chars.
ERROR: [000.000] {foldtrajlite.c, line 1584} Retrieved update file /pub/distribfold/download/patch/distribfold-patch-110-win9x.exe from ftp.mshri.on.ca but length was 1227549, should be 4436143

========================[ Oct 16, 2002 3:55 PM ]========================
ERROR: [000.000] {foldtrajlite.c, line 5672} Error during upload: STATUS 905 USER HANDLE NOT FOUND

========================[ Oct 16, 2002 11:56 PM ]========================

========================[ Oct 17, 2002 1:29 AM ]========================

========================[ Oct 17, 2002 11:35 AM ]========================

========================[ Oct 17, 2002 11:42 AM ]========================


+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

First instance partial log


========================[ Oct 4, 2002 6:09 AM ]========================

========================[ Oct 4, 2002 6:23 PM ]========================
ERROR: [777.000] {ncbi_socket.c, line 921} [SOCK::s_Connect] Failed SOCK_gethostbyname(www.distributedfolding.org)
ERROR: [777.000] {ncbi_connutil.c, line 640} [URL_Connect] Socket connect to www.distributedfolding.org:80 failed: Unknown
ERROR: [777.000] {ncbi_socket.c, line 921} [SOCK::s_Connect] Failed SOCK_gethostbyname(www.distributedfolding.org)
ERROR: [777.000] {ncbi_connutil.c, line 640} [URL_Connect] Socket connect to www.distributedfolding.org:80 failed: Unknown
ERROR: [000.000] {foldtrajlite.c, line 4734} Error during upload: NO RESPONSE FROM SERVER - WILL TRY AGAIN LATER
ERROR: [777.000] {ncbi_socket.c, line 921} [SOCK::s_Connect] Failed SOCK_gethostbyname(www.distributedfolding.org)
ERROR: [777.000] {ncbi_connutil.c, line 640} [URL_Connect] Socket connect to www.distributedfolding.org:80 failed: Unknown
ERROR: [777.000] {ncbi_socket.c, line 921} [SOCK::s_Connect] Failed SOCK_gethostbyname(bioinfo.mshri.on.ca)
ERROR: [777.000] {ncbi_connutil.c, line 640} [URL_Connect] Socket connect to bioinfo.mshri.on.ca:80 failed: Unknown
ERROR: [000.000] {foldtrajlite.c, line 1157} Unable to check server status

========================[ Oct 5, 2002 7:20 AM ]========================

========================[ Oct 15, 2002 11:32 PM ]========================
ERROR: [000.000] {foldtrajlite.c, line 4734} Error during upload: NO RESPONSE FROM SERVER - WILL TRY AGAIN LATER
ERROR: [000.000] {foldtrajlite.c, line 4734} Error during upload: NO RESPONSE FROM SERVER - WILL TRY AGAIN LATER
ERROR: [000.000] {foldtrajlite.c, line 4734} Error during upload: NO RESPONSE FROM SERVER - WILL TRY AGAIN LATER
ERROR: [000.000] {foldtrajlite.c, line 4734} Error during upload: STATUS 905 USER HANDLE NOT FOUND

========================[ Oct 16, 2002 3:35 PM ]========================
ERROR: [000.000] {foldtrajlite.c, line 5672} Error during upload: STATUS 905 USER HANDLE NOT FOUND
ERROR: [001.001] {foldtrajlite.c, line 5815} StartServiceCtrlDispatcher failed with error 1063

========================[ Oct 16, 2002 11:56 PM ]========================

========================[ Oct 16, 2002 11:56 PM ]========================

========================[ Oct 17, 2002 1:29 AM ]========================

========================[ Oct 17, 2002 10:55 AM ]========================
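
For anyone trying to compare two logs like the ones above, here is a minimal Python sketch that counts ERROR lines per session and source file and pulls out any "length was X, should be Y" download mismatch. The line format is taken only from the excerpts in this post; the function name `summarize` and the regular expressions are illustrative assumptions, not part of the client.

```python
import re
from collections import Counter

# Matches lines such as:
# ERROR: [777.000] {ncbi_socket.c, line 921} [SOCK::s_Connect] Failed ...
# (format assumed from the log excerpts above)
ERROR_RE = re.compile(
    r"^ERROR: \[(?P<code>[\d.]+)\] \{(?P<file>[^,]+), line (?P<line>\d+)\} (?P<msg>.*)$"
)
# Matches the session separators, e.g. =====[ Oct 4, 2002 6:23 PM ]=====
SESSION_RE = re.compile(r"^=+\[ (?P<stamp>.+?) \]=+$")
# Matches the truncated-download message
LENGTH_RE = re.compile(r"length was (\d+), should be (\d+)")


def summarize(log_text):
    """Return (error counts keyed by (session, source file), size mismatches)."""
    counts = Counter()
    mismatches = []
    session = None  # errors before the first separator get session None
    for raw in log_text.splitlines():
        line = raw.strip()
        header = SESSION_RE.match(line)
        if header:
            session = header.group("stamp")
            continue
        error = ERROR_RE.match(line)
        if not error:
            continue  # skip blank lines and non-ERROR lines
        counts[(session, error.group("file"))] += 1
        size = LENGTH_RE.search(error.group("msg"))
        if size:
            mismatches.append((session, int(size.group(1)), int(size.group(2))))
    return counts, mismatches


# Three lines copied from the second instance's log above
SAMPLE = """\
========================[ Oct 15, 2002 11:33 PM ]========================
ERROR: [000.000] {foldtrajlite.c, line 4734} Error during upload: STATUS 905 USER HANDLE NOT FOUND
ERROR: [010.003] {taskapi.c, line 1199} [ReadServerResponse] Timeout waiting for response, got 1227860 chars.
ERROR: [000.000] {foldtrajlite.c, line 1584} Retrieved update file /pub/distribfold/download/patch/distribfold-patch-110-win9x.exe from ftp.mshri.on.ca but length was 1227549, should be 4436143
"""

if __name__ == "__main__":
    counts, mismatches = summarize(SAMPLE)
    for (session, source), n in sorted(counts.items()):
        print(f"{session}: {source}: {n} error(s)")
    for session, got, expected in mismatches:
        print(f"{session}: download was {got} of {expected} bytes")
```

Running both instances' logs through this and comparing the counts shows which sessions differ. In the excerpts above, the only entries unique to the second instance are the taskapi.c timeout and the truncated patch download.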

Brian the Fist
10-17-2002, 03:09 PM
Are you running them as services or just from the command line? Have you installed 2 complete copies of the program in different directories? Try NOT running two instances for a couple of days and see if your computer still locks up. It's very unlikely that such a problem would be caused by it.

muttley
10-17-2002, 11:34 PM
Will do.

I run them from explorer.exe by opening the foldit.bat file.

And they are in different directories.

The only program I have added to the system has been the Tyan motherboard monitor program.

muttley

mighty
10-18-2002, 10:14 AM
Could it be a conflict in the temp directory? Does each instance have its own temp dir?

/Ole

muttley
10-18-2002, 10:26 AM
Originally posted by mighty
Could it be a conflict in the temp directory? Does each instance have its own temp dir?

/Ole


On the drive they are
foldit/distribdfold/
and
folditTWO/distribfold/

Beyond that, I don't know about a temp directory with the -rt option engaged.

I was running the last fold, then did SETI for 10 days, and came back after the fold change, and that is when it started locking up. It seems to have been running fine since I went back to SETI. I have reseated the RAM and drive cables, and run a virus scan, a hard drive check and a defrag. In a day or 2 I will run the program again and see if the problem continues. I still wonder about that file that was the wrong size. If the problem persists I will zip the folders, set up new instances in new folders, and see what happens.

muttley

vsemaska
10-18-2002, 11:23 AM
I've run up to 4 instances using the same temp directory without any problems.

Vic

muttley
11-05-2002, 05:34 AM
Sorry about the delay in getting back to this.

I zipped the files and one said 40 files and the second said 39 files were zipped.

I reinstalled the program and started 2 separate instances from the command line.
Running the program for a couple of hours so far has caused no lockups.
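
A 40 versus 39 file count like the one above can be tracked down by listing which files exist in only one of the two install folders. Here is a minimal Python sketch, assuming both installs are ordinary directories on disk; the function names `relative_files` and `compare_installs` are hypothetical helpers and not part of the client.

```python
import os
import sys


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def compare_installs(dir_a, dir_b):
    """Return (only_in_a, only_in_b) as sorted lists of relative paths."""
    files_a = relative_files(dir_a)
    files_b = relative_files(dir_b)
    return sorted(files_a - files_b), sorted(files_b - files_a)


if __name__ == "__main__":
    # Usage: python compare_installs.py <first install dir> <second install dir>
    if len(sys.argv) != 3:
        sys.exit("usage: compare_installs.py DIR_A DIR_B")
    only_a, only_b = compare_installs(sys.argv[1], sys.argv[2])
    for path in only_a:
        print("only in first: ", path)
    for path in only_b:
        print("only in second:", path)
```

This only compares file names, not contents. An extra file in one folder may be nothing more than a leftover download or log, so the output is a starting point rather than a diagnosis.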

muttley

muttley
11-13-2002, 02:35 PM
Something in Service Pack 1 from Microsoft is causing the problem. I have no idea what, but I have now run for over 48 hours and had no system freezes.
It is something to do with this system's setup and configuration and the Tyan Tiger -4M version motherboard.

I had time to thoroughly evaluate the system, and SP1 is the problem. What in SP1 I don't know, and I don't believe that I have the resources to find out.
The only option I didn't try is turning off the -rt switch, but other than that I have done everything else. At the present time I have only a GeForce 4400 by Leadtek, a floppy, a 40 GB hard drive by Seagate and a CD-R/RW drive, plus a 550 watt Enermax power supply and registered memory, which I swapped with other modules and increased up to 1.5 GB, with no effect.
One other possible culprit is the onboard LAN. Sound is via USB and was disconnected at the time of testing.
MB BIOS was the latest.

muttley

Halon50
11-13-2002, 08:44 PM
muttley, I hate to get off-topic, but I have a Tyan 2466N-4M board with two MP 2200+ CPUs in it, and they each seem to get similar performance to my Duron 1.1GHz CPU. I have another system with an XP 2200+ that seems to run 40% faster than a process run on one of the MP 2200+ CPUs, with the same OS and memory (well, the board has registered ECC memory in it, but the performance hit from ECC definitely doesn't warrant such a speed differential).

Are there some BIOS tweaks that could eke out some additional performance from these CPUs? I think I've been through the entire list of CMOS settings, and I didn't see anything special.

If it helps any, I recently got the system running on WinXP SP1 just fine, with no network hiccups as far as I could tell. At the time, I had an MSI GF4600 in the AGP slot, and an SB Live in the lowest PCI slot.

Brian the Fist
11-13-2002, 11:00 PM
If you still have trouble, please post a screenshot of the error, if possible, or else give a complete description of the problem: exactly how you started it, what flags you set in foldit.bat, how many you run at a time and in how many different directories, and how long it runs before it 'dies' (variable amounts, or always at the same point?). Could another program be crashing your machine instead, or are you certain it is the client? Are your CPUs or RAM overclocked?

muttley
11-14-2002, 04:12 AM
I'll get back to this as time permits.
I'll use msconfig to turn off the services and startup items (after reloading SP1) to see if some software is causing the problem, since you alluded to a program conflict. (See the note at the bottom about the mouse; I'll also remove and reload the mouse drivers.)

Nothing is overclocked.

By freeze or lockup I mean total: the mouse freezes, the screen freezes, Windows Task Manager freezes and its graph doesn't move, and there is not even an error message to be sent to Microsoft.

To try to isolate the problem and get folding to run, I inserted the Windows XP Pro CD and told it to install as an upgrade (by upgrade I refer to video drivers etc.) so I wouldn't lose my information. I then ran folding for a day or two with no lockups. Then I started adding updates from Windows Update (I don't remember which). When I added SP1, the problem started all over again.
I removed SP1 and the problem quit. Right now I am running without SP1, but I still have a few post-SP1 patches installed; I was warned that some specific programs might not work if I removed them, so I left those updates in place. (I was rather confident at the time that reloading Windows had solved the problem, so I was not as methodical as I could have been.) But it was when I removed SP1 that the problem went away.

An example of a repeatable time when the computer would freeze is on a mouse click while playing Solitaire. I tried different mice, one being a USB optical Logitech (going through a hub) and another being a generic PS/2 mouse (both mice having the scroll feature).

Till next time when I load SP1 and disable startup and service programs.

Bruce Gaylord (Gaillard)

TheOtherPhil
11-14-2002, 05:11 AM
I had a similar problem on my current workstation (dual XP2400's) and WinXP SP1. Both clients crashed after about 1 minute for no apparent reason with nothing left in the logs. I did a fresh install with the same results. I put it down to the current protein and I just removed the client and went back to G@H.

muttley
11-14-2002, 07:47 AM
Is your Tyan motherboard the -4M that has the corrected chipset so that the USB ports work, unlike the prior version that came with a daughter card giving you USB 2 to make up for the problem?
If the above doesn't work, what I am considering is disabling the onboard LAN and inserting a LAN card to see if that is the problem. When I reinstalled/updated on top of the OS, it said the LAN might not work.
Or possibly disable the USB.

muttley

TheOtherPhil
11-14-2002, 08:16 AM
The clients that crashed were run on my Iwill MPX2. The Tyan S2466N I own is the older one with the on-board USB disabled.

Khan
01-12-2003, 03:09 AM
I just recently had the same problem as you two on my own Iwill MPX2 w/ a pair of 2400s. I had shortcuts to two different foldit .bat files in my startup folder, and at first it would crash after a half hour or so, then after a few minutes. The whole system. A freeze, out of the blue. I had never had XP do that to me before, not without an error report or blue screen.

Eventually I started trying to blame it on my HD and RAM, and even took my CPUs off to check that I had done a decent job of applying the thermal compound and that my thermal probe was seated properly. (No chance of this being a CPU problem: water cooled and stable, no OC... yet.)

After a couple of reboots spent changing settings and chasing my tail, the system started to freeze instantly upon loading. It would usually get to the point where the two clients popped up and said handle found, blah blah, wait five seconds, and then freeze. I couldn't close anything in time, and the only thing that kept working was the mouse, not that I could click anything. Since it froze instantly, I tried repairing Windows from the CD, but that didn't work, since it must not wipe out your startup folder. I ended up reinstalling a fresh copy of Windows XP Pro. :bang:

I'm ticked off it's a client problem, and I apparently won't be able to capitalize on my dually setup, but relieved I didn't screw up a $240 processor or $200 mobo. :D
Hope there's a patch for this soon, and until there is.. back to F@H for me.

Brian the Fist
01-12-2003, 12:28 PM
It is extremely unlikely that our client is causing the problems you describe. However, if you stop using the client and the problem goes away, I guess you had best not run it.

TheOtherPhil
01-12-2003, 12:52 PM
The problems I had on the last protein have gone away with the current protein and DF is running great on my system again.

Brian the Fist
01-13-2003, 10:29 AM
Well, that makes no sense, as essentially nothing changed (other than erasing that horrible out-of-control log file problem).