
Thread: Nov 20th Client - FATAL ERROR: [013.000]

  1. #1
    Boinc'ing away
    Join Date
    Aug 2002
    Location
    London, UK
    Posts
    982

    Nov 20th Client - FATAL ERROR: [013.000]

    I upgraded my clients to the new memory-leak-free client a couple of days ago and have noticed that it likes to die quietly. The error I get in error.log is:

    FATAL ERROR: [013.000] {foldtrajlite2.c, line 5719} Cannot open file .\progress.txt - disk may be out of space
    This has happened on a PC with 28GB free space on the disk DF is running on, on another with 200GB free space and the 3rd one has 50GB free space...

    At first I thought it might be to do with dfGUI / dfMon interfering with DF writing/reading the progress.txt - stopping both from running/monitoring hasn't removed the problem...

    I have noticed that all the times the client died, dfGUI was reporting 100% on all 3 laxness levels...probably a coincidence...

    Also, it appeared that only my Win2K boxen were dying - but it's happened a couple of times today on my XP boxes as well...

    Any ideas on why the client is dying?

    /edit - dropped back to the Nov 03 client - memory leak is better than not running at all...
    Last edited by pfb; 11-25-2003 at 04:51 PM.

  2. #2
    At least we have more info on the error now. The 13 is EACCES. From MSVC docs:

    ----------------
    EACCES

    Permission denied. The file's permission setting does not allow the specified access. This error signifies that an attempt was made to access a file (or, in some cases, a directory) in a way that is incompatible with the file's attributes.

    For example, the error can occur when an attempt is made to read from a file that is not open, to open an existing read-only file for writing, or to open a directory instead of a file. Under MS-DOS operating system versions 3.0 and later, EACCES may also indicate a locking or sharing violation.

    The error can also occur in an attempt to rename a file or directory or to remove an existing directory.
    -------------------------------------

    This is most likely a sharing violation. Is ANYTHING reading progress.txt on your system, perhaps?
    Howard Feldman

  3. #3
    perhaps dfGui?
    Team Anandtech DF!

  4. #4
    Originally posted by Brian the Fist
    This is most likely a sharing violation. Is ANYTHING reading progress.txt on your system, perhaps?
    Based on the error, the most logical thing would be to suspect *something* had the file open. Is it also possible that the DF client didn't close the file properly the previous time, so it still had the file open itself when the next call came to write to it?

    Whatever the reason though, since the progress.txt file isn't mandatory for the DF client to continue working, would it not be better if the DF client just noted the error in error.log and continued crunching instead of exiting at this point? The next time the method is called to update progress.txt it could try again and might succeed.
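
A minimal sketch of that idea, in C since the client source is foldtrajlite2.c. This is not the client's actual code: the function name `write_progress_nonfatal`, its parameters, and the wording of the log line are all hypothetical. It only shows a failed open being logged and reported to the caller instead of ending the run.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the DF source): write the progress file,
 * but treat a failure as non-fatal. Returns 0 on success, -1 on failure.
 * On an open failure a warning is appended to the log and the caller is
 * expected to keep crunching and try again at the next update. */
int write_progress_nonfatal(const char *progress_path,
                            const char *log_path,
                            const char *text)
{
    FILE *fp = fopen(progress_path, "w");
    if (fp == NULL) {
        int saved = errno;  /* save errno before later calls can change it */
        FILE *log = fopen(log_path, "a");
        if (log != NULL) {
            fprintf(log, "WARNING: [%03d] Cannot open file %s (%s), will retry later\n",
                    saved, progress_path, strerror(saved));
            fclose(log);
        }
        return -1;          /* not fatal: nothing calls exit() here */
    }

    int ok = (fputs(text, fp) >= 0);
    if (fclose(fp) != 0)    /* a failed close can also mean lost data */
        ok = 0;
    return ok ? 0 : -1;
}
```

The caller would simply ignore a -1 return and carry on with the current generation, since progress.txt is only informational.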

    Jeff.

  5. #5
    Originally posted by m0ti
    perhaps dfGui?
    Well he did say that:
    "At first I thought it might be to do with dfGUI / dfMon interfering with DF writing/reading the progress.txt - stopping both from running/monitoring hasn't removed the problem... "

    Jeff.

  6. #6
    Yup, we already changed this for the next release. It's very curious though - maybe a background virus scanner or something has the file open? Who knows.
    Howard Feldman

  7. #7
    Originally posted by Digital Parasite
    Well he did say that:
    "At first I thought it might be to do with dfGUI / dfMon interfering with DF writing/reading the progress.txt - stopping both from running/monitoring hasn't removed the problem... "

    Jeff.
    Hmm... I really should pay more attention.

    Sorry!
    Team Anandtech DF!

  8. #8
    Boinc'ing away
    Join Date
    Aug 2002
    Location
    London, UK
    Posts
    982
    Originally posted by Brian the Fist
    Yup, we already changed this for the next release. It's very curious though - maybe a background virus scanner or something has the file open? Who knows.
    The machine it occurred on most is a lean Win2k setup - no virus checker running or other progs that could have the file open (unless some hidden MS prog had it open...)... as you say, who knows?

    As long as it's fixed now (and will be available in the next update), I'm happy

  9. #9
    Boinc'ing away
    Join Date
    Aug 2002
    Location
    London, UK
    Posts
    982
    Getting this error again with the new update...

    Code:
    Thu Feb 12 07:50:12 2004 FATAL ERROR: [013.000] {foldtrajlite2.c, line 5707} Cannot open file .\progress.txt - disk may be out of space
    This was on a dual CPU PC running Windows 2000 - the other client didn't stop with this error and there is ~40GB free...

    Only monitoring happening is via dfMon - which is how I noticed the problem again...

    I'll keep an eye on it but thought I'd mention that the error is back

    /edit - just noticed another PC do it... it was at the end of a gen and trying to upload... that timed out and then the above error occurred...

  10. #10
    Boinc'ing away
    Join Date
    Aug 2002
    Location
    London, UK
    Posts
    982
    And another one... this was with no external monitoring utils running and 48GB free on the drive:

    Code:
    Thu Feb 12 12:55:52 2004 FATAL ERROR: [022.000] {foldtrajlite2.c, line 5761} Cannot open file .\progress.txt - disk may be out of space
    This is starting to become a bit of a concern - 3 out of 8 clients have now stopped with this error...

  11. #11
    Senior Member
    Join Date
    Jul 2002
    Location
    Kodiak, Alaska
    Posts
    432
    With all the nasty worms going around lately - have you verified that these systems with the problems are still virii clean? (And do the win2k windows updates help eliminate the errors - if the system is clean of virii?) Just trying to help rule out a few possibilities that will help focus on the culprit.

    You have mentioned that the machines have died whether dfq was monitoring them or not, correct?

    And are all the ones that failed duals, or are some of them single-CPU machines?
    www.thegenomecollective.com
    Borging.. it's not just an addiction. It's...

  12. #12
    Boinc'ing away
    Join Date
    Aug 2002
    Location
    London, UK
    Posts
    982
    All PCs are clean - AVG is running and does a scan during the night...3 PCs have had this problem - 1 Win2000 and 2 XP Pro setups. The Win2000 is the duallie - the others are a P4 2.4 'B' and an AMD64 3200+

    Looking at it further - 2 clients had the problem when they couldn't upload a gen (and had the 013.000 error) and the other just stopped mid-gen (with an 022.000 error)...

  13. #13
    Senior Member
    Join Date
    Mar 2002
    Location
    MI, U.S.
    Posts
    697
    The 013.000 error is "permission denied" (at least, according to /usr/include/asm/errno.h on my Linux box). The other one is "invalid argument" (according to the same file).

    According to MS's documentation, when GetLastError() returns 13, that's an "invalid data" error, and error 22 is "the device does not recognize the command". I'm pretty sure the client reports the Unix errno value, though.
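
To make the mapping concrete, here is a small sketch in C. It assumes the number before the dot in "[013.000]" is the C runtime's errno value, which is the poster's guess and not something confirmed from the client source. The helper name `errno_name` is made up for illustration.

```c
#include <errno.h>

/* Hypothetical helper: translate the errno-style number seen in the
 * client's error.log into its symbolic name. Only the codes relevant
 * to this thread are listed. */
const char *errno_name(int code)
{
    switch (code) {
    case EACCES: return "EACCES";  /* 13: permission denied */
    case EINVAL: return "EINVAL";  /* 22: invalid argument */
    case ENOSPC: return "ENOSPC";  /* 28: no space left on device */
    default:     return "other";
    }
}
```

Note that a genuinely full disk would normally show up as ENOSPC (28), so neither 13 nor 22 fits the "disk may be out of space" text in the message.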

    FYI...
    "If you fail to adjust your notion of fairness to the reality of the Universe, you will probably not be happy."

    -- Originally posted by Paratima

  14. #14
    Yes, 13 here means access denied I believe. Most likely some other process on your system was accessing the file (virus scanner or fold monitor?)
    Howard Feldman

  15. #15
    Boinc'ing away
    Join Date
    Aug 2002
    Location
    London, UK
    Posts
    982
    I have had the error since then - it seems to have occurred at the same time I was having a few 'net connection issues....

    On one of the PCs that threw up the 13 error, no virus scanner or monitoring was happening - I only knew it had stopped Folding as the fan wasn't as loud as it normally is (it's a thermal-controlled one)...

    Ho hum - the joys of MS systems

  16. #16
    Member
    Join Date
    Apr 2003
    Location
    Germany
    Posts
    59
    @howard

    Wouldn't it be possible, if such an error occurs, to wait 2 seconds (or one) and then try to access the file again? And only bring up an error message if the file still can't be accessed?
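
A rough sketch of that wait-and-retry idea in C. It is an illustration only, not how the DF client does it: the name `fopen_with_retry`, its parameters, and the `SLEEP_SECONDS` macro are all hypothetical.

```c
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#define SLEEP_SECONDS(s) Sleep((DWORD)(s) * 1000)  /* Sleep() takes milliseconds */
#else
#include <unistd.h>
#define SLEEP_SECONDS(s) sleep(s)
#endif

/* Hypothetical helper: try to open a file up to `attempts` times,
 * pausing `delay_seconds` between tries. Returns the open FILE* or
 * NULL if every attempt failed. */
FILE *fopen_with_retry(const char *path, const char *mode,
                       int attempts, unsigned delay_seconds)
{
    for (int i = 0; i < attempts; i++) {
        FILE *fp = fopen(path, mode);
        if (fp != NULL)
            return fp;                     /* got it, possibly after a retry */
        if (i + 1 < attempts)
            SLEEP_SECONDS(delay_seconds);  /* give the other process time to let go */
    }
    return NULL;                           /* caller decides whether this is fatal */
}
```

Something like `fopen_with_retry(".\\progress.txt", "w", 2, 2)` would match the suggestion above: one retry after two seconds, and an error only if the second attempt also fails.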

  17. #17
    That's a great idea - we've already done this. If you look, this thread is several months old now...
    Howard Feldman

  18. #18
    Member
    Join Date
    Apr 2003
    Location
    Germany
    Posts
    59

    Unhappy

    OK!
    My intention was just to help solve problems...

