
Thread: Fodder to chew on...

  1. #1

    Fodder to chew on...

    Science, as we are dealing with it, is not perfect. Computers develop "enhancements" all by themselves and occasionally fail. Computer programs contain bugs that once in a while just plain cannot be found. These are but a few of the certainties that we live with here in the DC community. However, there is one thing that I just cannot grasp.

    Although I doubt that anyone has attempted to break it, and based upon Howard's challenge that they could, it is apparent that the code to prevent cheating seems to be doing as it was designed to do. That code IMHO absolutely and positively belongs in the program. Now, that said, it is also apparent that the code, as it stands, has a negative side to it.

    Because of that code, any flaw, glitch or whatever you may call it in the foldit.lst file results, in MOST cases, in the total loss of all cached work. Frustration is something that scientists must learn to live with on a daily basis; it goes with the territory. We, those who willingly and of our own accord donate our computers and the associated costs to run them to the various projects, generally do not deal with frustration on a daily basis. We do not have to deal with "Oh damn, I spilled my coffee in the petri dish and will have to start over from scratch", or "Oh my, the refrigeration failed and ruined my project." No, we only deal with the frustration of seeing the work generated by our own bought-and-paid-for resources flushed down the drain because of some bug, error or whatever that results in completed production being rendered worthless. I am relatively sure that this frustration is the cause of many having left our ranks here. I may be wrong but I sure doubt it.

    I do not profess to be computer hardware literate; at best I am a novice. I do not profess to be programming literate. I took one semester of some programming course a lot of years ago and likely do not remember a thing I learned. I do however possess a fair amount of common sense. I do know that if it starts to rain, one should either get an umbrella or get inside. That same common sense leads me to believe that if code can be written to curtail cheating, there must be a way to expand that code so that a safeguard could be put in place to "back up" to the most recent "good/uncorrupted/redundant" file, other than using "purgelist", which seldom works, and that file could then be uploaded. Perhaps I am thinking along the lines of a railroad that has two separate tracks that never cross each other: a break in one does not render the other useless.

    I've lost my share of output over the past year and a half, as have others. I've stuck it out and likely will in the future. This problem, in and of itself, seems to be the most irritating of late though, and I know firsthand of others who have thrown in the towel because of it. The DF powers that be have been extremely responsive since day one in addressing problems here. This one however seems to be a "That's the way it is. Like it or leave it." and I just do not understand why. There must be a solution to this without jeopardizing the security. What is it?

  2. #2
    Senior Member
    Join Date
    Sep 2002
    Location
    Meridian, Id
    Posts
    742
    I'll jump in here.

    I'm not planning to move off this project. But I second the view expressed here. I do not believe the Missing gen error or the make traj error have been fixed, and I have filled Howard's inbox with the zipped folders of lost work to prove it. I also restart clients on a daily basis because of the server communication problem.
    I'm willing to put the time in and continue to herd my sheep, but I can see how these things would quickly drive off the beginner or the less dedicated.




  3. #3
    I agree more or less with what you say. Do not confuse security with integrity however. Many of the checks and constraints we impose are to maintain the integrity of the data we receive on the server, and not to avoid cheating. The missing generation error indicates that you uploaded gen X but the server cannot find your gen X-1. Thus your simulation has a break in it somewhere. If we accepted this, it would result in a defective simulation, perhaps eventually misleading us to false scientific conclusions potentially. Our prime goal is to ensure all data accepted is correct, as if we had run it locally on a single machine here all along.
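
    As a rough illustration of the rule just described (gen X is accepted only if gen X-1 is already on the server), here is a minimal Python sketch. It is not the project's actual server code; the function names and the assumption that a run starts at generation 1 are made up for the example.

```python
def can_accept(upload_gen, stored_gens, first_gen=1):
    """Accept gen X only if it starts the run or gen X-1 is already stored.

    upload_gen  -- generation number being uploaded (hypothetical input)
    stored_gens -- generation numbers the server already holds for this run
    first_gen   -- assumed number of the first generation in a run
    """
    stored = set(stored_gens)
    return upload_gen == first_gen or (upload_gen - 1) in stored


def first_missing_generation(stored_gens, first_gen=1):
    """Return the first gap in a stored run, or None if the run is unbroken."""
    stored = set(stored_gens)
    if not stored:
        return None
    for gen in range(first_gen, max(stored) + 1):
        if gen not in stored:
            return gen
    return None
```

    If every upload is checked this way, the stored generations always form an unbroken chain, which is the property the integrity checks are there to protect.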

    Clearly we wish to avoid user frustration with these errors and have done our best. However, people do things we don't expect all the time, and in some cases may run programs that interfere with the client that we and probably even they do not know about. To date, every bug/error that we have been able to reproduce here has been corrected. The problem with this lingering missing generation or related errors is that no one seems to be able to tell us how to reproduce the errors. We can get limited information from examining the 'messed up' directories if you send them to us, but we really need to recreate the bug here in order to fix it.

    As far as I know, if you leave your computer alone running the client and it is connected to the network 24 hrs/day or whenever it is powered on, you should have no problems. My computers all run that way and I have never had a problem. If you have a more 'non-standard' setup, problems could occur. I've done my best to accommodate things like faulty or overclocked system RAM, dial-up users, unexpected system crashes and reboots - frankly, I think few professional pieces of software deal with any of these situations very gracefully.

    Getting back to the original point though, we would love at least as much as you to get rid of the remaining, relatively rare, problems. Sure, we could do the Micro$oft solution, and automatically make a backup of the whole directory every generation, and use up an extra few gigabytes of your hard disk, to allow you to 'system restore' to an earlier point if you so desire. Personally I prefer the more elegant solution of 'fixing bugs'.

    However to do so we still need your assistance. If you are receiving these errors frequently you should be able to reproduce them. Just stop and think - what am I doing that could be different, or what did I do just before it stopped working? If you can, try disabling ALL non-essential programs running on the machine and try again - if the problem occurs again, that is a useful piece of information - none of those programs are causing the problem. Ones to especially watch for are firewalls, 'security' programs and virus scanners and the like. Try bypassing hardware firewalls or proxy servers if you can too. If you can verify that you get the same error with or without some of these, e-mail us or post here and tell us! Then we know not to suspect that program. There are so many combinations of hardware and software these days that it is impossible to predict what could go wrong, especially on Windows environments. We have had very few problem reports on UNIX environments lately, though this may just be because of the user bias towards Windows.

    I'm not sure if that's the response you were looking for but I hope it clarifies our stance anyhow.
    Howard Feldman

  4. #4
    Senior Member wirthi's Avatar
    Join Date
    Apr 2002
    Location
    Pasching.AT.EU
    Posts
    820
    If we accepted this, it would result in a defective simulation, perhaps eventually misleading us to false scientific conclusions potentially. Our prime goal is to ensure all data accepted is correct, as if we had run it locally on a single machine here all along.
    Of course you know that a lot better than I do, but why does the result have to be incorrect just because you don't know how it was computed? I mean, I calculate a very good fold at generation 100; why do you care about my generation 99? Of course there was some kind of bug somewhere that led to the loss of generation 99 (at least that's what your server thinks), but that SHOULD NOT mean that generation #100 is corrupt. Of course it could, but that would be another major problem (and is not the case right now, from what I have read).

  5. #5
    Originally posted by Brian the Fist

    I'm not sure if that's the response you were looking for but I hope it clarifies our stance anyhow.
    In all honesty I haven't a clue what I was looking for as a response. As I stated in my first post, I have had but a few problems since starting the project right at the beginning. One of which was that I transposed a couple of letters in my handle, which caused the loss of three days' worth of work, and that was entirely my fault.

    See if perhaps this makes any sense as a plausible workaround for the "Corrupt filelist.txt" problem. Yes, I am pretty much exclusively dealing with those of us that are not hardwired to the internet here. It is readily apparent, to even the most casual observer, that it takes but an instant for the server to decide that a filelist.txt file is good or corrupt. Obviously there is something it is looking for, and if it is not exactly as it should be, it gets rejected and all the work is just so much wasted resources.

    Now, how much trouble would it be for that check to be performed at the end of every generation? Needless to say, that check would increase the size of the working directory, but perhaps it is worth that.

    Generations 1 through 44 complete and there are no problems and the program flawlessly moved on to the next generation.

    At the completion of generation 45, when the "check" is done, something is amiss.

    The "IF" at the server level currently says "Oops, bad data. You are not allowed to come here, nor are any of your friends." and everything is rejected. With the "IF" being done at the program operation level, should that occur, the most recently completed work is deleted and generation 45 is rerun with the completed generation 44 as the base once again. I liken this to the current "Tight Spot", only in this case it is a "Road Temporarily Closed": you must detour back where you came from for a block and try again.

    I apologize if none of this makes any sense to you. It is clear as can be in the hollows of my mind.
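
    For what it is worth, the detour described above (check each generation locally, and on failure throw it away and rerun from the last good one) can be sketched in a few lines of Python. This is only an illustration under assumptions: run_generation and is_valid are hypothetical stand-ins for the client's real work step and for whatever test the server applies, neither of which is known here, and the retry limit is invented.

```python
def run_with_rollback(run_generation, is_valid, initial_state, last_gen, max_retries=3):
    """Run generations 1..last_gen, re-running any generation that fails the check.

    run_generation(prev_state, gen) -> new state  (hypothetical client step)
    is_valid(state) -> bool                       (hypothetical local check)
    """
    good_state = initial_state   # most recent result known to be good
    history = []                 # accepted results, one per generation
    for gen in range(1, last_gen + 1):
        for _attempt in range(max_retries):
            candidate = run_generation(good_state, gen)
            if is_valid(candidate):
                good_state = candidate
                history.append(candidate)
                break
            # Bad result: discard it and retry from the previous good generation.
        else:
            # Every retry failed; stop rather than build on bad data.
            raise RuntimeError(f"generation {gen} failed {max_retries} times")
    return history
```

    Only the failed generation is repeated; everything up to the last good generation is kept, and nothing bad is ever used as the base for the next one.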

    As stated in my first post I have no intention of leaving the project. I am just wondering why what I am thinking cannot work and eliminate one more point of potential aggravation. If the answer is that it is just too damn much work to make it happen, so be it. At least I got an answer, not one that I may like, but nonetheless an answer. Then I can begin thinking of yet another potential solution. (insert cute icon with smoke coming out of ears here)

    You all are doing a great job here and have been extremely responsive to our whims since day one. I applaud you for that.

    --
    Completed generations do not make members, rather it is members that make completed generations.

  6. #6
    For one thing, it is currently impossible to put gen 100 if gen 99 is missing, due to the way the system works. That aside, keep in mind we are now simulating the entire folding process. Thus we are interested in the entire dynamic simulation and not just the end point. These are to be compared with experimental folding results and other computational folding predictions. Thus missing steps are very bad.
    Howard Feldman

  7. #7
    Senior Member wirthi's Avatar
    Join Date
    Apr 2002
    Location
    Pasching.AT.EU
    Posts
    820
    Originally posted by Brian the Fist
    For one thing, it is currently impossible to put gen 100 if gen 99 is missing, due to the way the system works. That aside, keep in mind we are now simulating the entire folding process. Thus we are interested in the entire dynamic simulation and not just the end point. These are to be compared with experimental folding results and other computational folding predictions. Thus missing steps are very bad.
    That's the point I don't understand. I know that I can't compute #100 if I don't have #99 (that's obvious), but why should I not be able to UPLOAD #100 just because #99 is missing? That's like not being able to read an entire book just because one word in the middle is unreadable ...

    I didn't know you wanted the whole process of folding; I had thought only folding@home was interested in the process itself and that distributedfolding was just interested in the result (the fold). As I said, you are the one who knows how things work, so we will have to accept how the system works.

    Still, that's an annoying bug ....
