
Thread: Trap ctrl-c so that it doesn't erase all production?

  1. #1
    The Cruncher From Hell
    Join Date
    Dec 2001
    Location
    The Depths of Hell
    Posts
    140

    Trap ctrl-c so that it doesn't erase all production?

    Is it possible to make it so that ctrl-c doesn't erase all produced WU stored on that system?
    I, uh, hit ctrl-c on the wrong keyboard, and sorta wiped quite a few out, and it's happened to others too.

    Any way to stop that from happening?

  2. #2
    dismembered Scoofy12's Avatar
    Join Date
    Apr 2002
    Location
    Between keyboard and chair
    Posts
    608
    I think this was discussed in another thread. The program catches SIGTERM etc., but apparently that's just what it does. I think Howard said he'd add it to the list.

    Edit: Ah, here it is.

  3. #3
    The Cruncher From Hell
    Join Date
    Dec 2001
    Location
    The Depths of Hell
    Posts
    140
    I hope it's soon.

    Howard, care to comment?

  4. #4
    I realize a lot of people are used to using idiot-proof software from Microsoft, but they have had 20 years to figure it all out. I cannot accommodate every possible mistake or error someone could make with the software. If you made a boo-boo, it's your boo-boo. As for it deleting all the work when a signal is caught, when the next release is made it should lose only the work since the last checkpoint (5000 or whatever) instead of ALL work when a signal (seg fault, Ctrl-C, TERM or otherwise) is caught.

    Someone suggested that it behave like when the service exits, but the difference is the Windows service has a second thread (the service manager) which can clean things up, while the normal text client does not. Yes, I could make it multithreaded (in fact it's already compiled as a multithreaded app on most OSes), but I really don't see it as too worthwhile at the moment and would rather focus on more important issues.
    Howard Feldman

  5. #5
    The Cruncher From Hell
    Join Date
    Dec 2001
    Location
    The Depths of Hell
    Posts
    140
    Originally posted by Brian the Fist
    I realize a lot of people are used to using idiot-proof software from Microsoft, but they have had 20 years to figure it all out. I cannot accommodate every possible mistake or error someone could make with the software. If you made a boo-boo, it's your boo-boo. As for it deleting all the work when a signal is caught, when the next release is made it should lose only the work since the last checkpoint (5000 or whatever) instead of ALL work when a signal (seg fault, Ctrl-C, TERM or otherwise) is caught.

    Someone suggested that it behave like when the service exits, but the difference is the Windows service has a second thread (the service manager) which can clean things up, while the normal text client does not. Yes, I could make it multithreaded (in fact it's already compiled as a multithreaded app on most OSes), but I really don't see it as too worthwhile at the moment and would rather focus on more important issues.
    Gee, someone's friendly today

    I asked a simple question. Your answer, without the rant, is just fine.
    Only losing up to 5K is fine for me.
    Losing a ton of work because I typed a keystroke into the wrong keyboard is not.

    --Scott

  6. #6
    Member lemonsqzz's Avatar
    Join Date
    Sep 2002
    Location
    Mountain View, CA
    Posts
    97
    Seemed like a right-on response to me!!
    Not everything in the world is candy-coated..
    He just tells it like it is....


  7. #7
    Originally posted by Scotttheking

    Only losing up to 5K is fine for me.
    Losing a ton of work because I typed a keystroke into the wrong keyboard is not.
    Hi,
    of course, if you inadvertently typed Ctrl-C to shut down, you could probably save your old work (not the current 5K) by manually rebuilding filelist.txt and letting the client upload the old structures on restart. Or are all old *.bz2 (including .log.bz2) files really deleted when you press Ctrl-C? I don't think so, but then, I never tried this ...
    Just a thought,
    Jerome
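
    Jerome's recovery idea can be sketched as a small script: scan the work directory for leftover .bz2 result files and write their names into a rebuilt filelist.txt so the client picks them up on restart. This is a hypothetical sketch only; the real client's filelist format and file-naming conventions are assumptions here and may differ.

    ```python
    import os

    def rebuild_filelist(workdir):
        """Rebuild filelist.txt from leftover .bz2 result files.

        Hypothetical sketch: assumes the client accepts a plain list of
        file names, one per line, which may not match the real format.
        """
        entries = sorted(f for f in os.listdir(workdir) if f.endswith(".bz2"))
        with open(os.path.join(workdir, "filelist.txt"), "w") as out:
            for name in entries:
                out.write(name + "\n")
        return entries
    ```

    Run against the client's work directory (with the client stopped), this would list every orphaned .val.bz2 and .log.bz2 file for the next upload attempt.
    
    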

  8. #8
    Release All Zigs!
    Join Date
    Aug 2002
    Location
    So. Cal., U.S.A.
    Posts
    359
    Originally posted by Brian the Fist
    moment and would rather focus on more important issues.
    Does that include checksumming for the client, especially with the unanswered concerns about abuse of the duplicates feature?

    As was pointed out, one of those lost sets might be important. How do you know that a sub-8 RMS structure wasn't in one of those lost sets? Can the science and the project afford to lose an important result?

    Best,

    RuneStar½
    The SETI TechDesk
    http://egroups.com/group/SETI_techdesk
    ~Your source for astronomy news and resources~

  9. #9
    Senior Member
    Join Date
    Mar 2002
    Location
    MI, U.S.
    Posts
    697
    Runestar: The random number your client just picked was 100 for the placement of the next amino acid. Had it picked 101 instead, you would have ended up with a 4-angstrom RMSD structure. As it is, picking 100, this structure is 9 angstroms RMSD. Can the science and the project afford to lose this important result?

    I would hope so, since it loses those things with the generation of every protein. Just some minor changes to the random numbers, and you could have had an outstanding structure. But the code doesn't know this; if it did, we wouldn't need to be here.

    Not that I'm saying it's no big deal to lose work; I've done it, it stinks, and yes, a 2-angstrom protein could have been in that result set. But frankly, it's OK, because that's the way probabilities and correlation and stuff work. They expect to have a sub-8 RMS structure by the end, but if they don't, I suspect it won't be a huge deal. They'll have collected enough structures (10 billion is probably enough, I'd say) to come up with something close anyway. Yes, it stinks to shut down the client unexpectedly and lose ~2500 results, but the project will persevere through such difficulties.

    If Howard is still reading this thread, about what percentage of the structures that could be generated by this current data set are below 8 angstroms (and obviously this would be a back-of-the-envelope number, I don't expect you to come back with an exact percentage, that's ludicrous)? Can you even tell us? I'd bet it's not less than one percent or so, so if we assume it's one percent, what's one percent of the total number of possible shapes of strings of 105 elements? Even just assuming there are 4 possibilities for the next AA's placement (which is probably wrong), there are 4^105 different structures total. That's like 10^63 or so. So there are (by my guesses, which are nothing more than guesses, and completely uninformed ones at that), still around 10^61 structures that are below 8 angstroms. We'll see another one.
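
    The back-of-the-envelope arithmetic above holds up (taking the post's own guessed numbers of 4 placements per amino acid and 105 amino acids as given):

    ```python
    import math

    # Guessed numbers from the post: 4 placement choices per amino acid,
    # 105 amino acids, so 4^105 possible structures in total.
    total = 4 ** 105

    # 4^105 = 10^(105 * log10(4)), which is about 10^63.2
    print(math.log10(total))        # about 63.2

    # One percent of that still leaves on the order of 10^61 structures
    print(math.log10(total / 100))  # about 61.2
    ```
    
    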

  10. #10
    Release All Zigs!
    Join Date
    Aug 2002
    Location
    So. Cal., U.S.A.
    Posts
    359
    I suppose you bring up some valid points, and hopefully people aren't JUST doing the project for the numbers. =)

    Isn't it a reasonable concern though that the lost work was in vain? I suppose the odds may be against us that that sub-8 angstrom structure was in that lost work, but it doesn't make it impossible. There's that doubt about it. To be fair, there's also the lost CPU time. Admittedly it might have been showing flying toasters or swimming (we hope) fish anyways... we might be rather attached to those flying fish and swimming toasters... and we gave them up so we wouldn't lose cycles for DF.

    I suppose Brian should write up more details, especially in regard to what you mentioned. Forums are a nice place to interact, but people would like to see more information on the inner workings.

    One other reason for the checksumming though is the duplicates feature. Howard didn't really address how this was being checked for abuse. I think it would calm a lot of minds if we knew that at least a rudimentary system to check for abuse was in place.

    <clink, clink> My two cents' worth... Substitute lowest local denomination for your country as needed...

    RS½
    The SETI TechDesk
    http://egroups.com/group/SETI_techdesk
    ~Your source for astronomy news and resources~

  11. #11
    Senior Member
    Join Date
    Mar 2002
    Location
    MI, U.S.
    Posts
    697
    Yes, losing work and CPU time does suck. I've done it a couple of times. But I don't complain, because:

    1) It doesn't happen often.

    2) When there are valid bugs in the software (the one a couple of versions ago, where it would delete queued results when it checked for updates, is the one that springs to mind), Howard is generally really good at fixing them. And, last but not least,

    3) Whenever I've lost work in the past (and YMMV, of course), it's been my fault. When it happens, I think "Wow. Deleting the files that it stores in /tmp, while it's running, was really stupid. I should probably never do that again!" and I don't. I think of it like the doctor joke: "A man goes to the doctor and says 'whenever I bend my finger like this, it hurts!' And the doctor says 'Well, don't do that then!'"

    Of course, this is all just my opinion. It may be complete :bs:

  12. #12
    Runestar: I don't know what checksumming you are talking about here however I can assure you no duplicate structures are being stored anywhere.

    As for the other point, a number of people have tried to make this (invalid) point in regards to lost or rejected work. What you need to understand is that the experiments we are doing test a probabilistic method. We are not looking for a cure for cancer, or any other disease. We are testing a method. If we ask the question 'What is the best RMSD we can get in a sample size of 10 billion?' then we want 10 billion structures, not 10 billion and one, or whatever. If there was an amazing 2A structure that was lost somehow and the next best one was 7A, so be it. the 2A would have in that case been a 'fluke' and likely ignored by us anyways. The RMSD should fit a smooth distribution when plotted for all structures, and there should be a continuum of RMSD's. Note how close together the top 10 RMSDs are right now.

    Thus any individual structure has little meaning here. It is only when we look at them together in the context of the sample size that they take on significance to help us answer the questions we are trying to ask. It might be nice to have a 2A structure, but if all the rest were 8A and up, it wouldn't really prove much except that someone got very lucky. I hope this makes the issue a bit clearer to some of you.
    Howard Feldman

  13. #13
    Release All Zigs!
    Join Date
    Aug 2002
    Location
    So. Cal., U.S.A.
    Posts
    359
    Hey Howard,

    Fair enough, and although we're not just in it for the numbers, you have to admit that the numbers play a big part in encouraging people to crunch. =)

    If we're going to crunch structures, we'd like to make sure we get the appropriate credit and that's not going to happen if the client is interrupted unexpectedly. While this might not be as bad for those with whole fleets of PCs at their disposal, it does make a difference for those who may have a couple (if that) at their disposal.

    There's just a feeling of futility when we see these .bz2 files sitting in the directory that won't ever get sent up. Basically, "Why did I just contribute all that PC time to them if it isn't going to be reported in?"


    As for the checksumming, I was referring to the thread regarding the feature that allows duplicate results to be sent up in case of a send failure...

    TTFN and thanks for the continual feedback,

    RuneStar½
    The SETI TechDesk
    http://egroups.com/group/SETI_techdesk
    ~Your source for astronomy news and resources~

  14. #14
    Another point that nobody mentioned about trapping Ctrl-C is a problem that I am currently seeing with some of my multi-user boxes. It would be great if Howard could modify the client so you just lose the current block you are working on instead of all the files.

    My situation:

    We have multi-user, multi-OS systems with Linux and Windows. They boot into whatever OS they need at the time. Now I have the DF client running in the background on these machines and they are offline so the results files start building up. Every time they reboot the machine the DF client catches a Sig15 and deletes all the work. So if they have happened to reboot into Windows before I have had the chance to collect the results (which has been happening a lot since people reboot at random intervals) I lose a whole day or more of work. So this is a real-life situation that is causing a problem instead of just an oops.

    Anyway, looking forward to the next release with this problem mostly fixed.

    Jeff.

  15. #15
    Jeff,
    that change has already been made (but not released until we change proteins). You'll only lose the most recent block of work; the rest will be buffered. Saving the last piece of work is trickier because it could stop, for example, in the middle of printing a line in the log file, leading to a corrupt file. So you still lose the last bit of work.
    Howard Feldman
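
    The behaviour Howard describes, where a Ctrl-C or SIGTERM costs only the block in progress, follows the standard pattern of setting a flag inside the signal handler and acting on it only at the next checkpoint boundary, so files are never written mid-signal. A minimal sketch of that pattern (the 5000-structure checkpoint interval comes from the thread; `crunch` and `save_checkpoint` are illustrative names, not the client's actual code):

    ```python
    import signal

    stop_requested = False

    def request_stop(signum, frame):
        # Only record the request; never touch files from inside a handler.
        global stop_requested
        stop_requested = True

    signal.signal(signal.SIGINT, request_stop)   # Ctrl-C
    signal.signal(signal.SIGTERM, request_stop)  # kill / system reboot

    def save_checkpoint(n):
        pass  # placeholder: the real client would flush buffered results here

    def crunch(checkpoint_interval=5000, total=20000):
        done = 0
        while done < total:
            done += 1  # stand-in for generating one structure
            if done % checkpoint_interval == 0:
                # Safe point: everything up to here is flushed and kept.
                save_checkpoint(done)
                if stop_requested:
                    break  # lose only the partial block after this point
        return done
    ```

    With this arrangement a signal arriving mid-block simply waits until the current checkpoint completes, which is why at most the last (sub-5000) chunk of work is lost.
    
    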

  16. #16
    That's great. I don't really care if I lose one block of data; it's just the 1-2 days' worth of data I have been losing that is more troubling.

    As always, we are happy you listen to the users and respond quickly.

    Jeff.

  17. #17
    Senior Member
    Join Date
    Jul 2002
    Location
    Kodiak, Alaska
    Posts
    432
    The directions that Team Stir Fry was sharing for uploading "orphaned" files didn't help save those 1-2 days worth of work?

  18. #18
    Originally posted by tpdooley
    The directions that Team Stir Fry was sharing for uploading "orphaned" files didn't help save those 1-2 days worth of work?
    No, the problem is that there are no "orphaned" files; the client deletes all the .bz2 log and val files, so there is no way to manually create a filelist.txt.

    Jeff.

  19. #19
    Is the client still losing all work when rebooted?
    When I reboot because the system hangs, it always seems to start over at 4999. Right now I've got 13 .bz2 files.

  20. #20
    Jeff,

    You might want to try using dfQ (especially if you've got one machine on your network that you can keep from being rebooted). It's just been released in beta, but see if it'll run unobtrusively enough for ya.

    You can set it to harvest the files every couple of minutes.

    Moti
    Last edited by m0ti; 02-23-2003 at 02:32 PM.
    Team Anandtech DF!

  21. #21
    Originally posted by Brian the Fist
    Runestar: I don't know what checksumming you are talking about here however I can assure you no duplicate structures are being stored anywhere.

    As for the other point, a number of people have tried to make this (invalid) point in regards to lost or rejected work. What you need to understand is that the experiments we are doing test a probabilistic method. We are not looking for a cure for cancer, or any other disease. We are testing a method. If we ask the question 'What is the best RMSD we can get in a sample size of 10 billion?' then we want 10 billion structures, not 10 billion and one, or whatever. If there was an amazing 2A structure that was lost somehow and the next best one was 7A, so be it. the 2A would have in that case been a 'fluke' and likely ignored by us anyways. The RMSD should fit a smooth distribution when plotted for all structures, and there should be a continuum of RMSD's. Note how close together the top 10 RMSDs are right now.

    Thus any individual structure has little meaning here. It is only when we look at them together in the context of the sample size that they take on significance to help us answer the questions we are trying to ask. It might be nice to have a 2A structure, but if all the rest were 8A and up, it wouldn't really prove much except that someone got very lucky. I hope this makes the issue a bit clearer to some of you.
    Howard, could you clarify what you mean here? If we are not looking for a cure with this project, what exactly are we doing? A lot of my teammates are putting in a lot of resources because we believe this research is directly related to medical research in finding a cure for cancer and other diseases. What you said above is making me, as well as my teammates, very confused.

    Thank you.

  22. #22
    Originally posted by Tawcan
    Howard, could you clarify what you mean here? If we are not looking for a cure with this project, what exactly are we doing? A lot of my teammates are putting in a lot of resources because we believe this research is directly related to medical research in finding a cure for cancer and other diseases. What you said above is making me, as well as my teammates, very confused.

    Thank you.
    I think I can help you slightly here. Basically, we are proving out and tweaking the software to work accurately, and while we do so we are working on known proteins as a control. Once the software algorithm is perfected, it can be used to predict the folded shapes of all sorts of different protein sequences. This information can be used by researchers to develop a variety of new medicines to treat diseases. Some additional useful information about the science behind the project can be found here on our team's website.
    Last edited by Aegion; 02-26-2003 at 08:22 PM.
    A member of TSF http://teamstirfry.net/

  23. #23
    I would suggest you read the About section of the web site, as well as the TraDES web site (http://bioinfo.mshri.on.ca/trades/), for details on our methods and on our goals. What we are doing is referred to as 'basic research'. You have to invent the wheel before you can invent the car, so to speak. The protein folding problem is an extremely general problem with applications to countless specific medically related problems. Thus our research will not directly cure any particular disease, but will INDIRECTLY help cure many, many diseases. If and when we are able to reliably predict protein structures, it will allow others to design drugs for various proteins more rapidly. Drugs can be made which are more potent, and have far fewer side effects, than the drugs that are commonly used today.


    If you have any more questions on this, don't hesitate to ask, it is a very important subject of course.
    Howard Feldman
