bill
07-06-2004, 04:52 PM
I think I have a cluster working, but my clients keep dying. At 1:05 PM one job died on a node, and at about 1:20 PM another died.
If I look at the two job directories:
node001:~/md5crk/node001/a> ls -alt
total 60
-rw-r--r-- 1 bill users 991 Jul 6 13:18 md5_perf.xml
-rw-r--r-- 1 bill users 12673 Jul 6 13:18 md5_send.xml
-rw-r--r-- 1 bill users 1050 Jul 6 13:10 md5_save.xml
drwxr-xr-x 3 bill users 4096 Jul 6 11:55 .
bill@node001:~/md5crk/node001/b> ls -alt
total 52
-rw-r--r-- 1 bill users 991 Jul 6 13:05 md5_perf.xml
-rw-r--r-- 1 bill users 12487 Jul 6 13:05 md5_send.xml
-rw-r--r-- 1 bill users 1050 Jul 6 12:47 md5_save.xml
drwxr-xr-x 3 bill users 4096 Jul 6 11:55 .
In neither case is md5_gui.log helpful or current. In neither case is CurrentDPCount at 200. In neither case is the daemon still running. Any ideas?
Oh, and if I'm using md5_preset without a proxy, do I skip the <proxy></proxy> line or just leave it empty?