tpdooley
01-22-2003, 04:39 PM
For close to 6 months I've had 1-12 machines with 24/7 internet connections (on dialup) folding away. I've had some of them lose power and shut down unexpectedly, I've turned at least one off, I've rebooted one while on the wrong keyboard, and I've had a few Windows updates finish by instantly rebooting the system.
In all those cases, I lose the information on the 10k structures being worked on. Booting back up, I restart DF, and it continues on (while I grumble about losing 200-9800 folds.. and those higher values do hurt a bit.. ;)
When folks are running /nonet (by choice or not, e.g. the cable modem needed to be rebooted to pick up the latest change in DNS info) and the system is unexpectedly shut down, they get messages similar to:
FATAL ERROR: [000.000] {foldtrajlite.c, line 760} Illegal file found in upload list
and we have to edit filelist to get rid of the entries that DF is hanging on - to get most of the stored collection to upload.
Can we get an official program, say "crashedtest.exe", that would rebuild the filelist, ensuring that the matching set of .log.bz2 and .val.bz2 files exists, and that they're in valid form and hold valid data?
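To make the request concrete, here is a rough sketch of what such a tool might do. It is only an illustration, not how DF actually works: I'm assuming the filelist is a plain text file named "filelist" with one base name per line, and that each entry is a pair `<name>.log.bz2` / `<name>.val.bz2` in the same directory. The function names are made up. It can only check that the files decompress cleanly; whether they "hold valid data" would need knowledge of the real file contents.

```python
import bz2
import os
from pathlib import Path


def is_valid_bz2(path):
    """Return True if the file exists and decompresses to the end without error."""
    try:
        with bz2.open(path, "rb") as fh:
            # Read the whole stream in chunks; a truncated or corrupt file raises.
            while fh.read(64 * 1024):
                pass
        return True
    except (OSError, EOFError, ValueError):
        return False


def rebuild_filelist(work_dir, list_name="filelist"):
    """Rewrite the (assumed) filelist so it names only complete, readable pairs."""
    work_dir = Path(work_dir)
    good = []
    for log_path in sorted(work_dir.glob("*.log.bz2")):
        base = log_path.name[: -len(".log.bz2")]
        val_path = work_dir / (base + ".val.bz2")
        # Keep the entry only if both halves of the pair are intact.
        if is_valid_bz2(log_path) and is_valid_bz2(val_path):
            good.append(base)
    # Write to a temporary file first, then swap it in, so a crash during
    # the rebuild cannot leave a half-written filelist behind.
    tmp = work_dir / (list_name + ".tmp")
    tmp.write_text("".join(name + "\n" for name in good))
    os.replace(tmp, work_dir / list_name)
    return good
```

Running `rebuild_filelist(".")` in the client directory would drop any entry whose pair is missing or damaged, which is the hand edit we currently do after a crash.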
----------
For the new algorithm, where our current results are based on the results of the previous generation, can you make the client a little more resilient to unexpected shutdown? For example, it could store temporary collections of 100 folds (the last 100, plus the preceding 4 collections), eliminating the oldest only after a successful write of the newest, so that a crash on generation 49 (where things will move relatively slowly) doesn't cost us significant work.
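The scheme I have in mind looks roughly like the sketch below. Everything in it is hypothetical (the class name, the `ckpt_*.bin` file naming, the idea that a collection of 100 folds can be saved as one blob of bytes); it only shows the ordering that matters: write the newest collection completely, and only then delete the oldest.

```python
import os
from pathlib import Path


class RollingCheckpoints:
    """Keep the newest `keep` checkpoint files, pruning older ones after each save."""

    def __init__(self, directory, keep=5):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.keep = keep

    def _existing(self):
        # Zero-padded sequence numbers make name order match save order.
        return sorted(self.directory.glob("ckpt_*.bin"))

    def save(self, seq, payload):
        final = self.directory / f"ckpt_{seq:08d}.bin"
        tmp = final.with_suffix(".tmp")
        with open(tmp, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())  # push the bytes to disk before renaming
        os.replace(tmp, final)     # the checkpoint appears whole or not at all
        # Only now, with the newest safely written, drop the oldest ones.
        for old in self._existing()[: -self.keep]:
            old.unlink()

    def latest(self):
        files = self._existing()
        return files[-1].read_bytes() if files else None
```

With `keep=5` this holds the last 100 folds plus the 4 collections before them. The `fsync` is the part that costs time, but it happens once per 100 folds rather than once per fold, so the slowdown should be small; it would need measuring on real hardware.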
(please balance the resiliency with speed of production - since if the resiliency features slow the client down by 1k structures a day, most of us would be better off running full bore, and just grumbling about the structures lost - since this is a fairly infrequent situation).