View Full Version : data cksum failed



QIbHom
07-24-2003, 10:02 AM
I'm getting this in my error log:

ERROR: [000.000] {foldtrajlite2.c, line 4484} File ./8geejxka_5_8geejxka_protein_78_0000020_min.val is corrupt, missing or has been tampered with; cannot continue - replace file and start again, or manually delete filelist.txt
ERROR: [000.000] {foldtrajlite2.c, line 4596} Error during upload: Data file checksum failed

Is there any way to recover from this that doesn't involve deleting my filelist.txt and losing 18 generations?

Running the linux command line client.

Thanks

Brian the Fist
07-24-2003, 10:49 AM
Could you please tar up and gzip your entire directory and send it to trades@mshri.on.ca?
This is a bug which surfaced in the previous version of the client and which we were unable to reproduce. Can you also tell us whether it has run continuously since gen. 0, or have you stopped and started it a number of times? If you have stopped it, do you recall whether it was ever between generations? All this info will help us locate and reproduce the bug. Thanks.

QIbHom
07-24-2003, 10:58 AM
Yep, I'll tar and gzip it up, as sure as I check my syntax.

It has run continuously since gen 0. It autoupdated just fine, and ran until I noticed this morning that I had generations queued.

Thanks!

QIbHom
07-24-2003, 11:11 AM
Correction. It has *not* run continuously.

I rebooted last night, around 1 am EST. Before doing so, I did a rm *.lock and waited a few.

Sorry about that. The coffee is just starting to kick in.

Brian the Fist
07-24-2003, 02:26 PM
I had a look at your files. If you look at the error.log, at 1:18 AM (when you restarted it) it gave a sig 11 (which means a seg. fault, i.e. a program crash). Then you restarted it again at 1:21 AM and started getting the error. Looking at the named file, I can see it is missing the checksum that should be on it. It appears the client crashed just as it was about to checksum the file.

So do you remember starting it and then it crashing and starting it again 3 minutes later? Any idea why it may have crashed that one time?
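
To illustrate the failure mode described above (a crash just before the checksum is written leaves a data file with no checksum on it), here is a minimal sketch of one common way to avoid it: write the data and its checksum to a temporary file, then rename it into place. This is not the client's actual code or file format. The function names, the "cksum ..." trailer line and the use of the POSIX `cksum` utility are assumptions made for illustration, and the sketch assumes text data that ends with a newline.

```shell
#!/bin/sh
# Illustrative only: the real .val format and checksum scheme are not
# shown in this thread. Function names and trailer format are hypothetical.

# Write stdin to "$1", followed by a trailer line "cksum <crc> <bytes>".
# The file only appears under its final name once it is complete.
write_with_checksum() {
    target=$1
    tmp="$target.tmp.$$"
    cat > "$tmp" || return 1
    # POSIX cksum prints "<crc> <byte count>" when reading from stdin
    sum=$(cksum < "$tmp") || return 1
    printf 'cksum %s\n' "$sum" >> "$tmp" || return 1
    # rename within one filesystem is atomic: a crash before this point
    # leaves only the .tmp file, never a half-finished target
    mv "$tmp" "$target"
}

# Return 0 if "$1" ends with a trailer matching the rest of the file.
verify_checksum() {
    file=$1
    [ -f "$file" ] || return 1
    trailer=$(tail -n 1 "$file")
    case $trailer in
        "cksum "*) ;;
        *) return 1 ;;  # no trailer, e.g. the writer died before checksumming
    esac
    expected=${trailer#cksum }
    # recompute the checksum over everything except the trailer line
    actual=$(sed '$d' "$file" | cksum) || return 1
    [ "$expected" = "$actual" ]
}
```

With this pattern, a crash at the wrong moment leaves a stray temporary file rather than a data file that later fails verification.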

QIbHom
07-24-2003, 07:45 PM
Nope. Just remember starting it once. It is possible that, after starting up my normal bunch of apps, I checked and restarted it. Don't remember doing that.

Thanks.

Brian the Fist
07-25-2003, 10:41 AM
Do you have it set to start automatically on boot, or do you start it manually? I really need to know why it crashed, as I have not seen this before and it is a good clue to the problem.

QIbHom
07-25-2003, 11:00 AM
I start it manually after boot, but I'm also using Dyrryth's script to stop and start it every 24 hours. Here it is, cut and pasted from another post around here somewhere:

Does anyone else have more machines they can borg/build? For
those of you running Linux/Unix out there, the following
cron entry has worked out pretty well for me:

0 */24 * * * rm /usr/local/distribfold/foldtrajlite.lock; sleep 30; cd /usr/local/distribfold; ./foldit

I've modded the foldit script to run this command line:

nohup ./foldtrajlite -f protein -n native -qt -rt -it > /dev/null &

Which essentially runs the process in the background. The
cron job restarts the client every 24 hours. So far, I
haven't seen a client grow beyond 200mb in the first 24 hours.

***
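
The quoted cron entry waits a fixed 30 seconds between removing the lock file and starting the client again. A slightly more defensive variant is sketched below, assuming (as the posts here imply) that the client shuts down once its lock file is removed. It polls until the old process is gone and refuses to start a second copy otherwise. The paths and the `foldit` script come from the quoted post; the function names, the 120-second timeout and the use of `pgrep` (from procps, not POSIX) are assumptions. Whether the fixed sleep has anything to do with the crash discussed in this thread is not established.

```shell
#!/bin/sh
# Hypothetical restart helper. Paths and client name are taken from the
# quoted post; the polling logic and the timeout are assumptions.

# Poll until no process named exactly "$1" remains, or "$2" seconds pass.
# Returns 0 once the process is gone, 1 on timeout.
wait_for_exit() {
    name=$1
    remaining=$2
    while pgrep -x "$name" > /dev/null 2>&1; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Ask the client to stop by removing its lock file, then start it again
# only after the old process has actually exited.
restart_client() {
    cd /usr/local/distribfold || return 1
    rm -f foldtrajlite.lock
    if wait_for_exit foldtrajlite 120; then
        ./foldit
    else
        echo "foldtrajlite still running; not starting a second copy" >&2
        return 1
    fi
}
```

To use it from cron, one would save this as a script, add a final line calling `restart_client`, and point the cron entry at that script instead of the inline command chain.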

I'm also using these on another linux box, which hasn't had problems yet. It also hasn't been running them for long.

Should I delete the directory, and start from scratch? I've been crunching on this comp, but, of course, can't upload anything.

Thanks!

Brian the Fist
07-25-2003, 06:31 PM
The error you received requires you to delete filelist.txt to get started again. I'm still very curious about the sig 11 crash, though, since the log says you started it twice, 3 minutes apart. Well, if you remember anything else, let me know; otherwise I'll have to wait for the next person to have it happen.

yellowfin
08-15-2003, 09:59 PM
Yes, I have run into this error too. I remember that last night I had a BSOD and the computer had to be restarted.

This morning I uploaded a few structures and then the error message came up.

The error message appeared after the DOS window hung while I was trying to upload a few more structures, and the DF GUI program hung as well.

So I am not sure which of the above caused the error message.

I had to delete the whole folder and download a fresh copy of the program to get it started again. About 200 structures gone. :cry:

PS. Using Windows 2000 Pro.

Brian the Fist
08-16-2003, 03:29 PM
You cannot get a sig 11 error on Windows...