
Segmentation fault



Scoofy12
09-12-2002, 12:19 PM
Since the last update I've been trying to run DF on a cluster of Linux boxen (the same one I've used many times before on this project), but when I start them up and come back later, I find many of them are no longer running, with nothing left in the error.log file. I normally run them in kind of a convoluted way with a script I have, so this time I just ran the foldit script under screen, and sure enough, it crashed within about 15 minutes with this error:

./foldit: line 26: 32055 Segmentation fault ./foldtrajlite -f protein -n native -qf -df -it -rt

I don't know if this has more to do with the system or the program; it's rock-solid stable on my own (overclocked, even) Linux box, but I can't seem to keep the clients running reliably on these systems. They are running RH7.3 and I'm using the icc version. I'm about to try the gcc version to see if it's stable, but if anyone has any thoughts, I'd be glad for suggestions.
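For anyone chasing a similar silent death with nothing in error.log, here is a minimal sketch of how a wrapper could record how the client ended. The real foldit script isn't shown in this thread, so `describe_status`, `run_logged`, and the `crash.log` file name are all hypothetical. It relies only on the usual shell convention that an exit status above 128 means the child was killed by signal (status - 128), so a segmentation fault (signal 11) shows up as 139.

```shell
# Hypothetical helpers; not part of the real foldit script.

# Turn a shell exit status into a readable description.
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # Shell convention: 128 + N means the child was killed by signal N.
        echo "killed by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
}

# Run a command, then append a timestamped line saying how it ended.
# The log path defaults to crash.log (an assumed name) unless CRASH_LOG is set.
run_logged() {
    "$@"
    rc=$?
    echo "$(date '+%Y-%m-%d %H:%M:%S') $1 $(describe_status "$rc")" >> "${CRASH_LOG:-crash.log}"
    return "$rc"
}
```

It could be used in place of the bare command, e.g. `run_logged ./foldtrajlite -f protein -n native -qf -df -it -rt`, so that each crash leaves a dated line behind even when the client itself logs nothing.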

Scoofy12
09-12-2002, 01:13 PM
Update: same result with the gcc client.
./foldit: line 26: 386 Segmentation fault ./foldtrajlite -f protein -n native -qf -df -it -rt

I forgot to mention that this is while running 2 processes on a dual-CPU machine (from different directories, of course). The 2nd one sometimes seems to live a lot longer than the first. (The 2nd from my earlier experiment is still happy, but both of them from this experiment have crashed, I assume the same way.)

Paratima
09-12-2002, 11:35 PM
I hereby offer to buy you a virtual beer to commiserate. I've got 2 Linux boxen happy as clams and one that gets Signal 11 errors, which are segmentation faults. Haven't had the time to track it down on the one box. Switching to the gcc compile reduced the frequency of the errors, but didn't fix them. Currently running that box under-clocked!

Check out the thread Signal 11 Error (http://www.free-dc.org/forum/showthread.php3?s=&threadid=1389), where you can learn more about segmentation faults than mere mortals were meant to know. :rolleyes:

Oh, and be prepared to be told that it's a memory problem. :rolleyes:

And if you find out what it really is, PLEASE tell me! (Even if it's RAM. ;) )

Scoofy12
09-13-2002, 01:25 AM
I doubt that it's RAM, for the following reasons:
a) it works on my overclocked box with cheap ram, and these (25 computers) are Dell boxes running at normal spec, and in normal daily use by normal daily users who have no problems.
b) (and more importantly) This is a problem I encountered many moons ago before CASP season with 1 or 2 older versions of the client. A subsequent update made all the problems go away, and indeed even last week's version had no problems running on these machines. This one does, apparently.

Brian the Fist
09-13-2002, 10:38 AM
Since it seems to be seg faulting only on your machine(s), it is not likely a problem with the software. Are any disk partitions full? If you continue to have problems, I can send you a debug version so you can get me a core dump/stack trace to see exactly where it is crashing, and that might give a clue as to what is going on. Let me know if you want to try this.
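The disk-space question can be checked quickly across many machines. The following is a rough sketch, assuming POSIX-format `df -P` output (capacity in column 5, mount point in column 6); `full_partitions` is a hypothetical helper name, and mount points containing spaces are not handled.

```shell
# Hypothetical helper: print mount points whose usage is at or above
# a threshold percentage (default 100).
# Reads `df -P` style output on stdin, so saved output can be checked too.
full_partitions() {
    threshold=${1:-100}
    awk -v t="$threshold" '
        NR > 1 {                    # skip the header line
            use = $5
            sub(/%/, "", use)       # "97%" -> "97"
            if (use + 0 >= t) print $6
        }'
}

# Example usage: list partitions that are at least 95% full.
# df -P | full_partitions 95
```

For the core dump side, running `ulimit -c unlimited` in the shell that launches the client normally allows a core file to be written on a crash; it can then be opened with `gdb ./foldtrajlite core` and the `bt` command prints the stack trace. The core file's exact name depends on the system's settings.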

Paratima
10-06-2002, 08:46 AM
Not RAM. Not the program. :cool:

It was the one thing I ALWAYS forget about.

Oh, embarrassed me. :rolleyes:

It was the stinkin' ROM BIOS !

Howard, put this on yer list of Brian's Advice to Distraught Folders:

"Is your motherboard's BIOS up-to-date?"

All boxen now running at full speed (well, actually a teeny bit faster than that, even). :D

IronBits
10-06-2002, 09:59 AM
Good job Paratima! :cheers: