Auto update was a total disaster



DATA
04-15-2002, 03:08 PM
Hi ppl

How did your update to the new client go?

Of the 48 systems within the S-MDC_TRUST, not 1 auto-updated.

Win98SE-based systems just sat there showing Done.
Win 2K Pro and XP systems just sat there; the client had gone.

Total fiasco.

I am now downloading a copy of the file and will yet again have to set up each system manually.

I am beginning not to believe anything that is said about this auto update. At the next client change I intend to shut down all clients, wait for the new client to be available on the site, then do a manual download.

Very disappointed indeed.

ulv
04-15-2002, 04:11 PM
Sorry to hear that, Data :cry:
I have only five machines (seven clients), all W2K. One had stopped; the others autoupdated nice and smooth.

Auritania
04-15-2002, 04:14 PM
Hey.. It wasn't a TOTAL disaster. I had 1 box stop and ask what to do (remind me why there is an autoupdate.cfg file with a 1 in it in my directory). One machine even crashed trying to update (win2k, nothing else installed). That machine just plain won't run the client at all.

FoBoT
04-15-2002, 04:24 PM
:confused:

hmmm, no problems here, all systems go

win95
win98
W2K pro
XP
linux

all did fine

sklepp
04-15-2002, 04:27 PM
Worked fine on OS X 10.1.3, up and running.
the memory clears itself out also.
Good Job!!!!!!
:cheers:

ScottMo
04-15-2002, 04:32 PM
Auto-update failed on all 3 Win2K boxes that were running at the time. The boxes that I stopped & started auto'd just fine. It seems that (for me at least) if a machine is running, it'll fail on the autoupdate & continue on with the old protein. When I get home from work, I'm sure that's what I'll find with the home machines. If I stop the client, wait for others to tell me the new one's up & restart, it works great.

pointwood
04-15-2002, 06:02 PM
My PC at home, which was running Win2k and the client as a service when the change occurred, updated just fine.

I'm not so sure about my laptop which is also running Win2k, but the client is not running as a service.

Brian the Fist
04-15-2002, 06:12 PM
I believe the auto-update is fully functional and working now, so if it didn't work it is likely a problem specific to your machine(s). It should update while the program is running, but it only checks once after every upload attempt so you have to wait until the next upload attempt and it should tell you a new version was detected.

The only known problem is on Win 2k/XP after restarting itself, you might not be able to exit by hitting Q. If this happens, try typing the word 'exit' and hitting enter (while the client is running). Then hit Q a bunch of times until it finally quits. This is a peculiarity of NT's DOS emulation which we haven't figured out yet and only occurs the first time after an update so it is not too critical.

Nanobot
04-15-2002, 06:58 PM
From experience, this update and the previous, if I get the "INVALID PROTEIN" message then the automatic update does not take place. Over the two updates some of the machines that autoupdated last time did not this time and vice versa.

Just thought I would throw in my 2p worth.

DATA
04-15-2002, 06:59 PM
Hi ppl

Think I found out why the auto update did not work: lack of bandwidth. At one time I had 29 systems all connected to DF, and on each system the download failed due to a corrupt client. We need BT in this country to get their finger out of their **** and get this facility up and running UK-wide.

I did try restarting a couple of systems, but the new client, at 5.4-odd MB, just takes too long to download.

So what I have done is:

Downloaded a copy of the client, unzipped it into a folder, and edited the foldit.bat file by adding the -df switch. Then I copied this folder to each system, created a shortcut, dragged and dropped it to the desktop, and fired it up. It's up and running within minutes.
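
For anyone wanting to script the manual rollout described here (download once, edit foldit.bat, copy the folder to every system), a minimal sketch follows. It is not part of the DF client: the function name `deploy_client` and the target paths are hypothetical, and it assumes each machine is reachable as a local or network path.

```python
import shutil
import tempfile
from pathlib import Path


def deploy_client(prepared_dir, target_dirs):
    """Copy one prepared client folder to every target directory.

    prepared_dir: folder holding the unzipped client with foldit.bat
                  already edited (for example with the -df switch added).
    target_dirs:  hypothetical per-machine destinations, such as
                  mapped network shares.
    Returns the list of destinations that were written.
    """
    prepared = Path(prepared_dir)
    if not (prepared / "foldit.bat").is_file():
        raise FileNotFoundError("foldit.bat missing from prepared folder")

    written = []
    for target in map(Path, target_dirs):
        # dirs_exist_ok=True overwrites an older client's files in place
        # (needs Python 3.8 or newer).
        shutil.copytree(prepared, target, dirs_exist_ok=True)
        written.append(target)
    return written


if __name__ == "__main__":
    # Self-contained demo in a temporary directory; real targets would be
    # per-machine folders or shares.
    root = Path(tempfile.mkdtemp())
    src = root / "client"
    src.mkdir()
    (src / "foldit.bat").write_text(
        ".\\foldtrajlite -f protein -n native -df -g 0\n"
    )
    for dest in deploy_client(src, [root / "pc1", root / "pc2"]):
        print("copied to", dest)
```

The shortcut creation and starting the client on each machine are left out, since those steps depend on how the machines are managed.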

Howard

is there any way you can design a client that can run on a system and when you change the protein it only needs to download a certain amount of info rather than the whole file each time - could save hours of work for many ppl

Tom
S-MDC_TRUST

FlybyKnight
04-15-2002, 07:04 PM
Originally posted by Brian the Fist
I believe the auto-update is fully functional and working now, so if it didn't work it is likely a problem specific to your machine(s). It should update while the program is running, but it only checks once after every upload attempt so you have to wait until the next upload attempt and it should tell you a new version was detected.

FWIW: last several updates, my 4 w2k machines upgraded without a problem. Today ALL 4 failed.

Previously 4 of 5 98 machines were upgraded without incident.
( one apparently wasn't given the patch.)

5 of 5 98 machines failed this time. Even a clean install and patch application would not upgrade. :bang:

Paratima
04-15-2002, 07:49 PM
Originally posted by Brian the Fist
so if it didn't work it is likely a problem specific to your machine(s).

I don't think so, Tim. I hear this from vendors all the time.

Linux - OK
NT4 - OK
W2K - 3 OK, 2 not (2 just stopped.)
W98 - forget it

The W98 client seems to be a VERY sensitive little thing. When I got home, one was sitting there saying DONE, the other was saying UPLOADING DATA TO SERVER. If it was, it was a helluva dump, lasting some 6 hours! :mad: We're making progress in some areas, but WIN98 needs to be on the fixit list.

Brian the Fist
04-15-2002, 09:54 PM
It would be helpful when you say the update 'failed' if you told me exactly what happened/what you see. It worked on all of our Linux/2000/XP machines flawlessly. Also mention if you are 'always online' or using dialup. The software relies on communication over the internet for updates so if you dial up, it won't update until you are online when it is checking.

Also, the update is only the minimal files needed, about 3MB this time, not the full 6MB. However if you have a lot of machines and limited bandwidth, you may be better off manually updating since then you only have to download once.

LBaker
04-15-2002, 10:08 PM
Update failed on all 5 of my machines: 3 Win 98 and 2 ME. All were updated on Sunday with the latest client and have the autoupdate.cfg file. When I got home from work, 4 machines had the client saying "done" and 1 was asking for permission. I'm on a cable modem with a constant connection on all machines.

KWSN_robegeor
04-15-2002, 10:10 PM
This one worked great for me. 17 instances (all NT 4.0 or 2000 Server). All but two updated fine and I suspect the two failures had crashed prior to the update occurring.

DATA
04-15-2002, 10:25 PM
Originally posted by Brian the Fist
It would be helpful when you say the update 'failed' if you told me exactly what happened/what you see. It worked on all of our Linux/2000/XP machines flawlessly. Also mention if you are 'always online' or using dialup. The software relies on communication over the internet for updates so if you dial up, it won't update until you are online when it is checking.

Also, the update is only the minimal files needed, about 3MB this time, not the full 6MB. However if you have a lot of machines and limited bandwidth, you may be better off manually updating since then you only have to download once.

Hi Howard

Above you ask us to tell you what happened and what we see. We have told you several times in the past and again today.

On Win98 systems all you see is "DONE" or "UPLOADING DATA TO SERVER", nothing else. What happens is nothing; the system just sits there.

I have a dedicated 64K ISDN line on 24/7 (can't have a 128K ISDN as it's too expensive). I don't know all the workings of the client, but as an example: I have had 54 instances of G@H running 24/7, automatically uploading and downloading without any problems, but they are not using 3MB data chunks. So it looks like the only solution for me at the moment, if I want to let this project run automatically, is to run only Win2K-based systems, reduce the number of systems on the project, and divert the remainder to another project.

Tom
S-MDC_TRUST
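
To put rough numbers on the bandwidth point, here is a back-of-the-envelope calculation using figures quoted in this thread (29 systems connected at once, an update of about 3MB, a 64K ISDN line). It is only an estimate under stated assumptions: 1 MB is taken as 1,000,000 bytes, the line is treated as a fully used 64 kbit/s link, and protocol overhead and retries are ignored. The function name is made up for the example.

```python
def transfer_hours(machines, update_mb, link_kbit_per_s):
    """Hours needed if every machine pulls its own copy over one shared link.

    Assumes 1 MB = 1,000,000 bytes, 1 kbit = 1,000 bits, a fully
    saturated link, and no protocol overhead or retries.
    """
    total_bits = machines * update_mb * 1_000_000 * 8
    seconds = total_bits / (link_kbit_per_s * 1_000)
    return seconds / 3600


# 29 systems each downloading the ~3 MB update over a 64K line.
print(round(transfer_hours(29, 3, 64), 2))        # 3.02 (hours)

# A single manual download that is then copied around by hand.
print(round(transfer_hours(1, 3, 64) * 60, 2))    # 6.25 (minutes)
```

Under these assumptions the simultaneous update needs about three hours of a saturated line, against a few minutes for one manual download.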

Paratima
04-15-2002, 10:53 PM
Originally posted by Brian the Fist
It worked on all of our Linux/2000/XP machines flawlessly.

Yes, Howard. It worked on all (well, most) of mine also. I suspect that where it didn't work, there were complicating factors.

The problem is with Win98/ME. It apparently did the update, then froze in one of a couple different states: DONE and UPLOADING.

When I did get them running, BTW, I had to stop one, and Q'd it, whereupon it went into UPLOADING virtually forever. Looked as if it sent the data, didn't get the right response for whatever reason, so locked up waiting. A retry or two would be nice, maybe followed by a meaningful (to me, the dumb user ;) ) message.

I guess what I'm saying is, give us some better tools and we can do more to help you. :p Oh, all my boxen are cable 24 x 7.

Auritania
04-15-2002, 11:06 PM
All the "failed" systems were just churning along on the old protein like nothing had happened. Some machines had attempted to connect more than 26 times to upload units. All machines are running win2k. All machines are full time connected. All machines have an autoupdate.cfg with a single line with a 1 in it. All machines have an empty error log. All machines have -df set.
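
The three per-machine conditions in that list (autoupdate.cfg holding a single 1, an empty error log, -df set) are easy to check with a script when there are many boxes. A minimal sketch, assuming the file names mentioned in this thread (autoupdate.cfg, error.log, foldit.bat); the function `check_client_dir` is hypothetical and not something the DF client provides.

```python
import tempfile
from pathlib import Path


def check_client_dir(client_dir):
    """Report three configuration conditions for one client folder."""
    d = Path(client_dir)
    cfg = d / "autoupdate.cfg"
    log = d / "error.log"
    bat = d / "foldit.bat"
    return {
        # autoupdate.cfg should hold a single line containing just "1"
        "autoupdate_enabled": cfg.is_file() and cfg.read_text().strip() == "1",
        # a missing or zero-length error.log counts as "empty"
        "error_log_empty": not log.is_file() or log.stat().st_size == 0,
        # -df must appear as its own switch on the foldit.bat command line
        "df_switch_set": bat.is_file() and "-df" in bat.read_text().split(),
    }


if __name__ == "__main__":
    # Demo against a throwaway folder set up the way the post describes.
    demo = Path(tempfile.mkdtemp())
    (demo / "autoupdate.cfg").write_text("1\n")
    (demo / "error.log").write_text("")
    (demo / "foldit.bat").write_text(
        ".\\foldtrajlite -f protein -n native -df -g 0\n"
    )
    print(check_client_dir(demo))
```

A folder that passes all three checks can still fail to update, which is exactly the situation reported here; the check only rules out the obvious configuration mistakes.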

The only way I got them to update was to use "Q" and then restart them. They found the new version and prompted me "Y" or "N".

One machine (100% stock Dell Dimension P3-1GHz) completely died. It is a dedicated DF box that has NOTHING but a clean install of win2k and the DF client. I try starting the new client on it and it reboots every time. Again, no errors recorded. I'm writing this box off as possessed and I now keep it in a dark corner in the basement.

JTrinkle
04-15-2002, 11:17 PM
all 8 Win2k systems updated perfectly... :cheers:

the 5 Win98 boxes just sat there for hours (til I got home) waiting for me to move the mouse or something :confused: :confused: :confused:
then they updated perfectly.

I think you might wanna add a Win9x box to your test fleet. :swear:

-JTrinkle

One good thing: between this slow ass protein and the new 10 billion limit.... It's gonna be a while before the next changeover :)

bwkaz
04-15-2002, 11:27 PM
My one Linux box updated perfectly.

... Well, that is, after it failed Sunday because the people at Intel named the file wrong... :mad: :swear: :bang:

But today, at just about noon, it uploaded what it had been working on, grabbed the new version, restarted, and has been running fine ever since.

Good job! :thumbs:

Shaktai
04-16-2002, 03:33 AM
My 4 boxes, two Win XP and two Mac OS-X all updated perfectly. So far the only update error I have experienced from the last two, stemmed from my typo on the autoupdate.cfg file.

pointwood
04-16-2002, 03:57 AM
My home PC running Win2k and the client as a service, updated just fine.

My laptop, also running Win2k but *not* the client as a service, didn't update. When I left work yesterday the changeover hadn't happened (I live in Denmark - so I'm "a few" hours ahead of you guys ;) ), and the client was happily crunching. The changeover happened a few hours after I arrived at home. Right now, having just arrived at work, I find the client still happily crunching the old protein :swear:
I stopped the client (hitting 'Q') and started it again, and then it successfully detected, downloaded and installed the update.

There is nothing in my error.log (I had it running over the weekend on dialup, which is why there are a lot of errors on Apr 14):


========================[ Apr 14, 2002 11:44 PM ]========================
ERROR: [000.000] {foldtrajlite.c, line 1011} Unable to check server status
ERROR: [777.000] {ncbi_socket.c, line 838} [SOCK::s_Connect] Failed SOCK_gethostbyname(anteater5.distributedfolding.org)
ERROR: [777.000] {ncbi_connutil.c, line 494} [URL_Connect] Socket connect to anteater5.distributedfolding.org:80 failed

=== Cut - A lot of the same lines - Cut ===

ERROR: [777.000] {ncbi_connutil.c, line 494} [URL_Connect] Socket connect to anteater5.distributedfolding.org:80 failed
ERROR: [777.000] {ncbi_socket.c, line 838} [SOCK::s_Connect] Failed SOCK_gethostbyname(anteater5.distributedfolding.org)
ERROR: [777.000] {ncbi_connutil.c, line 494} [URL_Connect] Socket connect to anteater5.distributedfolding.org:80 failed

========================[ Apr 15, 2002 8:39 AM ]========================

========================[ Apr 15, 2002 9:09 AM ]========================

========================[ Apr 15, 2002 4:08 PM ]========================

========================[ Apr 16, 2002 8:33 AM ]========================

========================[ Apr 16, 2002 8:35 AM ]========================
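
A log like the excerpt above can be summarized per session with a few lines of script, which makes it easier to see that the sessions after Apr 14 recorded no errors. This is a minimal sketch: the two regular expressions are inferred only from the lines quoted here, not from any documentation of the client's log format, and `summarize_error_log` is a made-up name.

```python
import re
from collections import Counter

# Session header, e.g. "=====[ Apr 14, 2002 11:44 PM ]====="
HEADER = re.compile(r"^=+\[\s*(.+?)\s*\]=+$")
# Error line, e.g. "ERROR: [777.000] {ncbi_socket.c, line 838} ..."
ERROR = re.compile(r"^ERROR: \[[\d.]+\] \{([^,]+), line (\d+)\}")


def summarize_error_log(text):
    """Count ERROR lines per session header and per source location."""
    per_session = {}
    per_source = Counter()
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            per_session.setdefault(current, 0)
            continue
        error = ERROR.match(line)
        if error:
            if current is not None:
                per_session[current] += 1
            per_source[(error.group(1), int(error.group(2)))] += 1
    return per_session, per_source


SAMPLE = """\
========================[ Apr 14, 2002 11:44 PM ]========================
ERROR: [000.000] {foldtrajlite.c, line 1011} Unable to check server status
ERROR: [777.000] {ncbi_socket.c, line 838} [SOCK::s_Connect] Failed SOCK_gethostbyname(anteater5.distributedfolding.org)
========================[ Apr 15, 2002 8:39 AM ]========================
"""

if __name__ == "__main__":
    sessions, sources = summarize_error_log(SAMPLE)
    for stamp, count in sessions.items():
        print(stamp, "->", count, "error(s)")
```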


The laptop is on a very stable (and expensive...) 256K internet connection, so I don't believe our connection has been a problem.

The client has a correctly created autoupdate.cfg file.

What is further "interesting" is that even though it updated successfully, it hasn't been able to upload any WU's the whole night, and when the client updated, it didn't delete the old WU's either, which can be seen from this list of my distribfold dir:


18-03-2002 10:28 35.644 blpotential.txt
18-03-2002 10:28 506.906 bstdt.val
18-03-2002 10:28 5.240 cbdata
18-03-2002 10:28 4.254 CompatibilityNotes.txt
18-03-2002 10:28 1.871 copyright
18-03-2002 10:28 1.614 distrib-update.bat
12-04-2002 09:16 59 foldit.bat
15-04-2002 10:15 1.490.944 foldtrajlite.exe
12-04-2002 13:10 44.888 native.val
12-04-2002 13:10 3.605.883 protein.trj
12-04-2002 13:14 21.199 readme1st.txt
18-03-2002 10:28 2.803.905 rotlib.bin.bz2
18-03-2002 10:28 2.327 skel.prt
18-03-2002 10:28 12.432 sleep.exe
12-04-2002 13:14 3.318 whatsnew.txt
18-03-2002 10:28 3.142 zhangatm.txt
18-03-2002 10:28 2.926 zhangeij.txt
16-04-2002 08:35 51.073 error.log
01-02-2002 13:47 10 handle.txt
16-04-2002 08:35 5 foldtrajlite.lock
16-04-2002 09:40 0 file.txt
23-03-2002 19:59 6.400 ng8iezad_protein_0006124.val.bz2
24-03-2002 09:42 1 autoupdate.cfg
15-04-2002 17:03 10.602 ng8iezad_protein_0004725.val.bz2
15-04-2002 17:15 10.585 ng8iezad_protein_0005791.val.bz2
15-04-2002 17:06 222.421 fold_ng8iezad_0_protein.log.bz2
15-04-2002 18:11 10.588 ng8iezad_protein_0010985.val.bz2
15-04-2002 18:01 222.117 fold_ng8iezad_5000_protein.log.bz2
15-04-2002 19:42 10.545 ng8iezad_protein_0019270.val.bz2
15-04-2002 18:55 222.597 fold_ng8iezad_10000_protein.log.bz2
15-04-2002 20:38 10.684 ng8iezad_protein_0024324.val.bz2
15-04-2002 19:50 222.454 fold_ng8iezad_15000_protein.log.bz2
15-04-2002 21:26 10.613 ng8iezad_protein_0028704.val.bz2
15-04-2002 20:45 222.556 fold_ng8iezad_20000_protein.log.bz2
15-04-2002 22:15 10.562 ng8iezad_protein_0033210.val.bz2
15-04-2002 21:40 222.784 fold_ng8iezad_25000_protein.log.bz2
15-04-2002 23:04 10.582 ng8iezad_protein_0037655.val.bz2
15-04-2002 22:35 222.588 fold_ng8iezad_30000_protein.log.bz2
15-04-2002 23:45 10.564 ng8iezad_protein_0041346.val.bz2
15-04-2002 23:30 222.822 fold_ng8iezad_35000_protein.log.bz2
16-04-2002 01:18 10.588 ng8iezad_protein_0049723.val.bz2
16-04-2002 00:26 222.429 fold_ng8iezad_40000_protein.log.bz2
16-04-2002 01:56 10.610 ng8iezad_protein_0053221.val.bz2
16-04-2002 01:21 222.412 fold_ng8iezad_45000_protein.log.bz2
16-04-2002 02:39 10.541 ng8iezad_protein_0057162.val.bz2
16-04-2002 02:16 222.502 fold_ng8iezad_50000_protein.log.bz2
16-04-2002 03:40 10.579 ng8iezad_protein_0062718.val.bz2
16-04-2002 03:10 222.692 fold_ng8iezad_55000_protein.log.bz2
16-04-2002 04:24 10.545 ng8iezad_protein_0066709.val.bz2
16-04-2002 04:05 222.455 fold_ng8iezad_60000_protein.log.bz2
16-04-2002 05:36 10.589 ng8iezad_protein_0073293.val.bz2
16-04-2002 05:00 222.199 fold_ng8iezad_65000_protein.log.bz2
16-04-2002 06:36 10.560 ng8iezad_protein_0078740.val.bz2
16-04-2002 05:55 222.715 fold_ng8iezad_70000_protein.log.bz2
16-04-2002 07:19 10.536 ng8iezad_protein_0082737.val.bz2
16-04-2002 06:49 222.221 fold_ng8iezad_75000_protein.log.bz2
16-04-2002 08:17 10.524 ng8iezad_protein_0088010.val.bz2
16-04-2002 07:44 222.538 fold_ng8iezad_80000_protein.log.bz2
16-04-2002 09:03 12.386 ng8iezad_protein_0000850.val.bz2
16-04-2002 08:33 198.866 fold_ng8iezad_85000_protein.log.bz2
60 file(s) 12.788.192 bytes
2 dir(s) 6.290.276.352 bytes free

My foldit.bat looks like this:
.\foldtrajlite -f protein -n native -df -g 0
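
To see how much old work is still sitting in a client folder after a changeover, as in the directory listing above, the result files can be counted with a short script. A minimal sketch: the two glob patterns (`*.val.bz2` and `fold_*.log.bz2`) are inferred from that listing and are an assumption about the naming scheme, and `leftover_work_units` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def leftover_work_units(client_dir):
    """Return (files, total_bytes) for result files still on disk."""
    d = Path(client_dir)
    # Structure files first, then the per-batch logs, each group sorted by name.
    files = sorted(d.glob("*.val.bz2")) + sorted(d.glob("fold_*.log.bz2"))
    return files, sum(f.stat().st_size for f in files)


if __name__ == "__main__":
    # Demo with two dummy files named like the ones in the listing.
    demo = Path(tempfile.mkdtemp())
    (demo / "ng8iezad_protein_0004725.val.bz2").write_bytes(b"x" * 10)
    (demo / "fold_ng8iezad_0_protein.log.bz2").write_bytes(b"y" * 20)
    files, total = leftover_work_units(demo)
    print(len(files), "file(s),", total, "bytes")
```

The script only reports what is there; whether those files can still be uploaded or should be deleted is a question for the project.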

Even though you, Howard, say it checks for a new version every time it uploads, somehow that doesn't seem to always work :cry:

Let me again ask for the possibility to be able to upload old WU's after a changeover, if only for a limited time?! If it is not possible, then I would like a very clear answer as to why it isn't possible and a detailed description of how the update process is supposed to work. Not just for me, but more for others and new members - it would be nice to have a link you could refer to when that question is asked.

Hope you can use this info for something :)

ScottMo
04-16-2002, 07:10 AM
Originally posted by Brian the Fist
It would be helpful when you say the update 'failed' if you told me exactly what happened/what you see. It worked on all of our Linux/2000/XP machines flawlessly. Also mention if you are 'always online' or using dialup. The software relies on communication over the internet for updates so if you dial up, it won't update until you are online when it is checking....

Using Win2K on a DSL connection (running at 768K). The client is running, doing old client work. It finishes the run and tries to upload, then returns the 908 error:
http://www.ameri-becca.com/incorrect.jpg

The client then starts back working on the old protein until it finishes the run and tries to upload. Gets the 908 error, and the cycle starts all over again until it hits the limit on unsent work & shuts down.

All 3 Win2K boxes do this. If I shut down the client, delete the filelist, and restart, the auto-update works fine, but that's not really auto-updating, is it?

Scoofy12
04-16-2002, 10:39 AM
Regarding the problem of DATA (and perhaps others) with internet connectivity and other related issues: this might be a good argument for setting up some sort of proxy system, if not for uploading, then at least for downloading updates. This way people with large farms (whom you surely want to attract anyway) and maybe without fast connections will not have to download several megabytes per client. This would also ensure that the clients had connections when they updated and prevent too much maintenance on the individual clients. Maybe you could even consider releasing the new protein data before the scheduled switch to allow people to test their proxy setup before all the clients come asking for it. (A good question was brought up though, as to why there is so strictly no overlap between proteins or client updates at all... maybe this could be clarified a bit? Maybe you could allow the old data for a few days after the switch... wouldn't that be better for the project to have the extra structures anyway?)
If the proxy were implemented for uploading too, the user could control when the upload happened to avoid using the connection when it might be needed for other things.
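
The download half of that proxy idea boils down to "fetch once, serve every client from a local copy". A minimal sketch of that caching step follows. Everything in it is hypothetical: `fetch_via_cache` is a made-up name, and `download` is a caller-supplied function standing in for the real transfer from the project server, whose protocol is not described in this thread.

```python
import tempfile
from pathlib import Path


def fetch_via_cache(name, cache_dir, download):
    """Return the path of `name` in a shared cache, downloading at most once.

    download: function taking the file name and returning its bytes.
    """
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    cached = cache / name
    if not cached.exists():
        # First request: fetch over the slow link into a temporary file,
        # then rename, so other clients never see a half-written file.
        partial = cache / (name + ".part")
        partial.write_bytes(download(name))
        partial.replace(cached)
    return cached


if __name__ == "__main__":
    calls = []

    def fake_download(name):
        # Stand-in for the real download; records how often it is used.
        calls.append(name)
        return b"new client files"

    shared = Path(tempfile.mkdtemp())
    for _ in range(3):  # three clients asking for the same update
        fetch_via_cache("update.bin", shared, fake_download)
    print("downloads over the slow link:", len(calls))
```

A real proxy would also need some way to tell when the cached copy is out of date, which this sketch leaves out.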

FWIW, all 24 of my linux clients, as well as a few 2k/XP services/text clients and a 98 box updated fine
:thumbs:

ohms18k
04-16-2002, 10:44 AM
Originally posted by ScottMo
All 3 Win2K boxes do this. If I shut down the client, delete the filelist, and restart, the auto-update works fine, but that's not really auto-updating, is it?

I didn't have to delete any files but I did have to restart the client on my W2K boxens, then the autoupdate.cfg file worked :rolleyes:. I have a 24/7 connection. I had a big cache :bang: of files ready to go before the update and I tried for hours to upload that cache before the update; the error log showed timed-out errors trying to connect to the server. I have to assume the server was down before the update. Plus I was told to keep folding the 62 protein because of problems. Why tell me to keep running something and not get credit for it? :rolleyes:
There is way too much waste running this project, a lot of it is mine.

Brian the Fist
04-16-2002, 05:46 PM
I have tried running the April 5 version (the 62-residue protein) on Windows 2000 using -df and -g0 in foldit.bat and with the autoupdate.cfg in place. It built 5000 structures of the old protein (I disconnected my ethernet cable during startup so it wouldn't find the update straight away), then detected the update, installed it, and restarted itself with no trouble.

I can provide the old version if anyone feels like testing this on their system but I still cannot reproduce the situation described in this thread.

There was a brief time on Sunday when the incorrectly named (my fault, not Intel's) update files were up. Maybe for 10 minutes. If your machine happened to detect this update during this period, it would have said "no signature found" or if you had autoupdate.cfg in place it might have just terminated and said Done.

It is possible that this is what happened with some of you, and was due to human error in that case (namely, mine :cry: ).

I have made sure the same mistake will not occur again of course.

As for uploading old protein data, as I have mentioned previously, this is not possible with our current hardware and software setup. It is simply not practical to do this. Especially now with the larger sample size, please keep in mind we will be generating over half a TERABYTE of data. As soon as it is done, we need to take it offline and start analyzing it, and then DELETE it to make room for the next protein. It would be nice to have unlimited storage space, but we don't have it. Even if we did have enough storage though, our network arrangement (of which I do not wish to divulge any details for obvious reasons) is not compatible with having multiple proteins going to different places. If you are always online, the work lost should be fairly negligible and dial-up users can try to do an 'upload dump' when the thermometer starts getting full. If this is inconvenient though, then there may be some wasted CPU cycles and we cannot change that at the moment unfortunately.

Paratima
04-16-2002, 08:18 PM
:) And :) the :) Win98 :) difficulties? he said, :) trying to be as :) positive :) as possible. :)

DATA
04-16-2002, 08:30 PM
Hi Howard

No need for you to :cry: over what happened. This is the best management-supported project of the last 6 I have undertaken, and at least you speak to us and keep advising and resolving the issues. For this I for one am grateful :D even though I have problems at client changeover :(

It has taken a bit of getting used to, having to dump millions of constructions at a client changeover, but I fully understand your situation. It's just a pity about all those wasted cycles. For the S-MDC I have come up with a plan for the next client changeover that will totally eliminate this problem: I am going to ensure all work is uploaded, then transfer to another project prior to server shutdown until the new client is available, then carry out a manual update, and away we go again. This should ensure no lost time or cycles. :cool:

Anyways, keep up the good work and have a couple of :cheers: on me :D

Angus
04-16-2002, 09:42 PM
I had at least a couple of Win98 boxes (on DSL) that tried to upload while the server was down for changeover on Monday, as well as some work machines (NT4 on a LAN) that appear to have stopped after not being able to upload during the aborted switchover on Sunday.

Those machines just stopped when they couldn't find the correct upload server.

Does the Win98/NT client try to upload first, then look for a new version, or the other way around? If it looked for the new client before trying to upload, it could/should dump the now 'old' work and download the update, otherwise, it seems like it will always puke trying to upload to the old server address ...
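
To make the suggested ordering concrete, here is a small sketch of an end-of-batch step that looks for a new client before it tries to upload. It only illustrates the idea in the question; how the real client orders these steps is not shown anywhere in this thread. All names are hypothetical: `pending` is a list of finished work files, and `upload` and `install` are callables supplied by the caller.

```python
def end_of_batch(local_version, server_version, pending, upload, install):
    """Check for a new client first, then upload only if still current.

    Returns a short string naming the action taken.
    """
    if server_version != local_version:
        # Results for the old protein would be rejected anyway,
        # so discard them and switch to the new client first.
        pending.clear()
        install(server_version)
        return "updated"
    for item in list(pending):
        upload(item)
        pending.remove(item)
    return "uploaded"


if __name__ == "__main__":
    sent, installed = [], []
    work = ["batch_a", "batch_b"]
    # Server already offers version 2 while the client runs version 1.
    print(end_of_batch(1, 2, work, sent.append, installed.append))
    print("uploaded:", sent, "installed:", installed, "left:", work)
```

With this ordering a client never keeps retrying an upload that the server will refuse, at the cost of throwing away whatever was finished just before the changeover.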

ohms18k
04-17-2002, 12:24 AM
I have to say this project is interesting. Those proteins get done so fast I am scrambling to get them in. When I do look at that thermometer its either at the bottom or at the top (too late). With this longer protein maybe I will have time to get my stuff together.

I have to give you a worst-case scenario: If I had decided to put DF on 30 boxens and gone on vacation for 3 weeks and this update had happened………I don’t know how many files or how many bytes of useless data would have accumulated………but I know, if that had happened, how to get kicked off a BB. :D



:cheers:

Scoofy12
04-17-2002, 10:36 AM
Originally posted by ohms18k
I have to say this project is interesting. Those proteins get done so fast I am scrambling to get them in. When I do look at that thermometer its either at the bottom or at the top (too late). With this longer protein maybe I will have time to get my stuff together.
:cheers:

I expect that won't be so much of a problem with this protein... on the downside, my gain rate on the top 10 is now much slower :( ;)