
Thread: Ticketing system details - changes in the new client and server

  1. #1

    Ticketing system details - changes in the new client and server

    The changes that have been made to the client in order to accommodate the new server setup are as follows:

    - the file upload is now processed one fileset at a time. This means that a fileset must be uploaded and fully validated on the server, and the client must be informed of the result, before any further uploading can be done

    - if a fileset is not processed on the server within a set amount of time (measured in seconds), the client will retain a receipt, which is stored on disk in the installation directory, in a file named receipt.txt. The client will then continue to work locally (unless running with -ut) until it is time to upload again

    - when the client needs to upload and a receipt.txt is present, the client will first check the status of that previous upload, and will not proceed with the new upload until the receipt confirms the completion of the previous fileset (this loops back to the logic above)

    - if receipt.txt is present, this means that the last uploaded fileset is still kept locally on disk in case the server processing times out and the fileset needs to be resent. filelist.txt is updated accordingly (see the sketch below)
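
    Roughly, the decision the client makes at upload time looks like the sketch below (shell pseudocode for illustration only, not the actual client code; check_receipt_status and upload_fileset are placeholders for logic inside the client, while receipt.txt and filelist.txt are the real file names):

    Code:
    #!/bin/sh
    # Rough sketch only -- not the actual client code.
    # check_receipt_status and upload_fileset are placeholders for logic
    # inside the client; receipt.txt and filelist.txt are the real files.

    check_receipt_status() {
        # Placeholder: ask the server whether the fileset named in
        # receipt.txt has finished processing. 0 = confirmed.
        return 1
    }

    upload_fileset() {
        # Placeholder: send the next fileset listed in filelist.txt and
        # wait for validation; on a server timeout the client writes
        # receipt.txt and keeps the fileset on disk for a possible resend.
        :
    }

    if [ -f receipt.txt ]; then
        # A previous fileset is still pending on the server.
        if check_receipt_status; then
            rm -f receipt.txt    # confirmed: old fileset can be discarded
        else
            exit 0               # not confirmed: keep crunching (or quit with -ut)
        fi
    fi

    upload_fileset               # proceed with the next fileset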

    For more information, please see a detailed article regarding the ticketing system architecture here: http://www.onlamp.com/pub/a/onlamp/2...et_system.html

    All of the above may be changed in the future according to user feedback. If you have any suggestions, please let us know.
    Elena Garderman

  2. #2
    I enjoyed reading this. It really shows the care and hard work going into this project behind the scenes. All of the naysayers should give this a read *cough*Michael H.W. Weber*cough*

    A couple questions:
    -You guys have said multiple times that you don't really have funding for bringing in new hardware; has this changed?
    -Does this solution address the local storage space problems some users were having at the beginning of this fast protein? It looks like you still won't accept another generation until the previous one has been completely crunched. Or are you expecting the performance increase to eliminate that problem in the future?

  3. #3
    We really need to just try it out now and see how it works for people. We are certain we will get your feedback soon enough, and then we can decide how to proceed and what changes, if any, need to be made from there.
    Howard Feldman

  4. #4
    Big Fat Gorilla guru's Avatar
    Join Date
    Dec 2001
    Location
    Warren, OR
    Posts
    501
    Sounds like a great way of balancing the load on the database by caching the work being uploaded. However, if the database is backed up it will limit the speed at which a single client can upload (i.e. waiting for a receipt from the server), but it will still allow other connections (clients not waiting for a receipt from the server). So basically it allows more clients to connect at a time, but at a reduced speed of uploading different datasets.

    Sounds like it will be a big help for us farmers who have large data sets to upload. It doesn't matter if your data set has 5 or 500 units; it gets cached until the next upload attempt.

    This caching of work should help out in regulating the work being sent to the database so you can control the number of inserts at a given time.

    guru
    I'm having fun!!! I'm just not sure if it's net fun or gross fun.

  5. #5
    Alive and XXXXing
    Join Date
    Nov 2003
    Location
    GMT +3
    Posts
    55
    Hmm - how about sneakernetting?

    As of the 22nd, the information flow is only one way, which, in my case, means that every morning I:

    1) (At home) Copy the DF directories onto a flash drive or CDRW
    2) (At home) Purge the directory

    3) (At work) Copy the directory onto the work machine (deleting all files left from last time)
    4) (at work) Upload with -u t

    With this new ticketing system, it looks like this is going to get MUCH more complicated.

    Please, someone, explain how this is going to work. I know I'm not the only one who sneakernets around here.

  6. #6
    Senior Member
    Join Date
    Jun 2003
    Location
    Windsor, England
    Posts
    950
    Darn good point; it will even affect those of us who use two directories.
    Still lookin' forward to the update today. I think they should always be done at 16:00 EST on a Friday.

  7. #7
    Big Fat Gorilla guru's Avatar
    Join Date
    Dec 2001
    Location
    Warren, OR
    Posts
    501
    It shouldn't affect you at all. The ticketing system is based on the upload directory, not your user name. Say you have 4 directories used to upload results. Each one will upload its data and possibly get a ticket if the server is busy. The directories that get tickets will simply wait to upload the next result set until the ticket is removed. It's not really any different than uploading with multiple directories now: if the server was busy, the client would just time out and you would have to restart the client to continue the upload.

    The advantage here is that the entire result set would be uploaded before the upload client would stall waiting on the server. Right now it's just part of the result set.

    guru
    I'm having fun!!! I'm just not sure if it's net fun or gross fun.

  8. #8
    Alive and XXXXing
    Join Date
    Nov 2003
    Location
    GMT +3
    Posts
    55
    It shouldn't affect you at all. The ticketing system is based on the upload directory, not your user name.
    You missed the point. Currently I:

    1) (At home) Copy the DF directories onto a flash drive or CDRW
    2) (At home) Purge the directory

    3) (At work) Copy the directory onto the work machine (deleting all files left from last time)
    4) (at work) Upload with -u t

    What do I do with the tickets after step 4? Is the client at home waiting until evening for me to bring the ticket? In other words, how is this going to work for sneakernetting?

  9. #9
    Tickets are only distributed if the server is unable to process your upload request. If you've uploaded everything you won't have any tickets.

    The presence of a ticket simply tells the client, "Stop trying to upload, go back to crunching. Come back later." Since you aren't trying to upload from home it won't be looking for tickets at all.

    After you have successfully uploaded all generations you won't have any tickets remaining.

  10. #10
    Senior Member
    Join Date
    Jun 2003
    Location
    Windsor, England
    Posts
    950
    Does the -ut switch work the same way? Otherwise I would end up with two instances running on one CPU. And I can see Xelas's point: if he can't complete a full upload during work hours, then the next day there's even more work brought in on CD.

    But hey, we will have to wait and see.

  11. #11
    Big Fat Gorilla guru's Avatar
    Join Date
    Dec 2001
    Location
    Warren, OR
    Posts
    501
    Ok, here goes nothing.

    Old way:
    Assume you have a data set of 10 results to upload. The client uploads one result at a time. The server gets the result, processes it to make sure it's valid, then inserts it into the database. The server then responds to the client that it has completed the task and asks for the next result. If the connection times out, the client stops the upload. If the client is set up to upload only, it simply stops; otherwise it just resumes processing until the next upload attempt. The next time it tries to connect, it starts the process over with the last result that was not uploaded successfully.

    New way:
    Using the same assumptions as above, the client would connect and send all 10 results to the server. The server would cache the results, do the verification, and insert them into the database. If the time it takes to do all this exceeds the set timeout, a ticket is handed back to the client as a receipt for the work submitted. The client will then stop if set up to upload only, or continue processing data until the next upload attempt. The server will continue working on the data that you submitted. The next time you connect, if you have a receipt file the client will check with the server to see if your data was processed successfully. If so, it uploads the next data set. If the first data set is still processing, the client will be turned away because the server is still busy. If the data for the receipt was lost or not processed, the data will be retransmitted to the server to be processed.

    That's as simple as I can describe it. This is how I understand it from reading what Stardragon has posted. If I got it wrong, please correct me.

    guru
    I'm having fun!!! I'm just not sure if it's net fun or gross fun.

  12. #12
    Guru - actually, you're a little bit off. In the new way, all 10 results would still not be uploaded to the server at once. What happens instead is that if the particular fileset currently being uploaded takes too long to validate, you will get a receipt and be able to do useful work offline, instead of getting a timed-out connection which results in missing generation errors. This will also allow us to cache uploads properly and balance the load on the database without inconveniences to the user.

    Nothing has changed in the actual upload process from the user's side. The -ut flag still works exactly the same - if validation is unavailable, it will simply quit (similar to a failed upload or a broken connection). The only effective changes visible to users would be the presence of the receipt file in the working directory, and fewer timed-out connections in the error log.
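
    For illustration only, an upload-only run under the new scheme can be wrapped like this minimal sketch (the foldtrajlite command line here is just an example; the only visible difference from before is whether receipt.txt is left behind):

    Code:
    #!/bin/sh
    # Sketch: one upload-only pass, then check whether a receipt is pending.
    # The foldtrajlite options are only an example; use your own settings.
    ./foldtrajlite -f protein -n native -ut

    if [ -f receipt.txt ]; then
        echo "Fileset handed to the server; validation still pending."
    else
        echo "No receipt.txt left behind; nothing is awaiting validation."
    fi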
    Elena Garderman

  13. #13
    Big Fat Gorilla guru's Avatar
    Join Date
    Dec 2001
    Location
    Warren, OR
    Posts
    501
    Thanks, Elena. Any idea on how much longer before the update is ready?

    guru
    I'm having fun!!! I'm just not sure if it's net fun or gross fun.

  14. #14
    OCworkbench Stats Ho
    Join Date
    Jan 2003
    Posts
    519
    Is the ticketing system going to cause an issue with uploading the buffered 58 Protein gens? People are getting funny error messages.
    Last edited by Grumpy; 04-22-2004 at 08:24 PM.
    I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.

  15. #15
    Big Fat Gorilla guru's Avatar
    Join Date
    Dec 2001
    Location
    Warren, OR
    Posts
    501
    So far I give the ticket thing an F grade! The server is up and running but I'm just not getting any results in!

    guru
    I'm having fun!!! I'm just not sure if it's net fun or gross fun.

  16. #16
    Junior Member
    Join Date
    Jul 2003
    Location
    Carbondale, Colorado, USA
    Posts
    18
    Originally posted by guru
    So far I give the ticket thing an F grade! The server is up and running but I'm just not getting any results in!

    guru
    Ditto on that. Every single one of my clients is buffering. None will upload!
    www.hardcoreware.net

  17. #17
    OCworkbench Stats Ho
    Join Date
    Jan 2003
    Posts
    519
    Without a doubt the worst updater ever. No buffered files will upload. All clients that update get either a 910 error or a 908 Wrong Protein error. So I had to manually download the client and set up new directories on all boxen. I have gone a whole day with no uploads old or new, no points, no nothing. I am typing this one-handed... why? The stress has given me a seizure and I am currently paralyzed on my left side... not funny :bs:
    I am not a Stats Ho, it is just more satisfying to see that my numbers are better than yours.

  18. #18
    Alive and XXXXing
    Join Date
    Nov 2003
    Location
    GMT +3
    Posts
    55
    Originally posted by Stardragon
    Guru - actually, you're a little bit off. In the new way, all 10 results would still not be uploaded to the server at once. What happens instead is that if the particular fileset currently being uploaded takes too long to validate, you will get a receipt and be able to do useful work offline, instead of getting a timed-out connection which results in missing generation errors. This will also allow us to cache uploads properly and balance the load on the database without inconveniences to the user.

    Nothing has changed in the actual upload process from the user's side. The -ut flag still works exactly the same - if validation is unavailable, it will simply quit (similar to a failed upload or a broken connection). The only effective changes visible to users would be the presence of the receipt file in the working directory, and fewer timed-out connections in the error log.
    Elena:

    If I upload with -ut, and the upload was NOT successful, I GET a receipt, and all files that I uploaded REMAIN on the machine. That means that there is a filelist.txt left hanging around as well.

    If I upload with -ut, and the upload WAS successful, I DON'T get a receipt, and all files that I uploaded get DELETED from the machine. That means that there is STILL a filelist.txt left hanging around as well.

    Right?

    Then, the next day, I bring the next fileset for uploading with -ut. It has its own set of files and, therefore, its own filelist.txt. Before the new system, I just wiped the old stuff out and replaced everything. What do I do with the old fileset (from the previous day) now? The new fileset?

    Sorry for sounding like a broken record, but I am NOT interested in how the new system works (to a certain extent). I AM interested in having simple, DF-for-Dummies, step-by-step instructions or an algorithm for how to properly sneakernet.

    That's it.

  19. #19
    Senior Member
    Join Date
    Mar 2002
    Location
    MI, U.S.
    Posts
    697
    Xelas -- put the new day's results into a different directory. Either that, or try to keep the filelist.txt files separate. You won't be able to just run the client in upload-only mode to flush out all previously buffered results, though; you'll have to figure out some kind of "loop while filelist.txt has more than 4 lines in it" or something.

    Make sure you sleep for a few minutes (I'd do 10 myself, but whatever) between upload attempts, to give the servers a chance to process your upload and verify the receipt.
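
    Something like this rough sketch (the 4-line threshold is just the figure above, and the foldtrajlite command line matches the upload script further down; adjust both to your setup):

    Code:
    #!/bin/sh
    # Sketch of the loop idea above. The 4-line threshold is just the figure
    # suggested here (filelist.txt hangs on to the last uploaded fileset);
    # the foldtrajlite options are the same example as in the script below.
    while [ "$(wc -l < filelist.txt)" -gt 4 ]; do
        ./foldtrajlite -f protein -n native -ut   # one upload attempt
        sleep 600                                 # ~10 minutes between attempts
    done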

    May I use this opportunity to ask again for an exit status from DF? Something like:

    exit(0) for no upload files left
    exit(1) for uploaded one set, server busy, got a receipt
    exit(2) for receipt not validated yet

    Or some documented exit status scheme, anyway. Something other than 0 all the time. This way, we could just loop until the DF process's exit status is 0.
    "If you fail to adjust your notion of fairness to the reality of the Universe, you will probably not be happy."

    -- Originally posted by Paratima

  20. #20
    Alive and XXXXing
    Join Date
    Nov 2003
    Location
    GMT +3
    Posts
    55
    Originally posted by bwkaz
    Xelas -- put the new day's results into a different directory. Either that, or try to keep the filelist.txt files separate. You won't be able to just run the client in upload-only mode to flush out all previously buffered results, though; you'll have to figure out some kind of "loop while filelist.txt has more than 4 lines in it" or something.

    Make sure you sleep for a few minutes (I'd do 10 myself, but whatever) between upload attempts, to give the servers a chance to process your upload and verify the receipt.
    "put the new day's results into a different directory."
    1) But then can I safely delete the previous directory? If yes, when? I assume that it's safe when filelist.txt is "empty"?
    Right?

    2) If the previous directory is not empty, then there is no point in uploading the next fileset, as it won't be accepted until the previous one has been processed.
    Right?


    May I use this opportunity to ask again for an exit status from DF? Something like:

    exit(0) for no upload files left
    exit(1) for uploaded one set, server busy, got a receipt
    exit(2) for receipt not validated yet

    Or some documented exit status scheme, anyway. Something other than 0 all the time. This way, we could just loop until the DF process's exit status is 0.
    Currently I scan error.log. You can search for particular errors if they are added to the error.log file. It's very easy to implement using the "find" command in bat files. Let me know if you need an example.
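
    For example, here is a rough shell version of the idea (the .bat version just uses "find" on error.log instead of grep; the 910 code is only an example, so match whatever actually shows up in your error.log):

    Code:
    #!/bin/sh
    # Sketch of the error.log check in shell form (same idea as the
    # "find"-based .bat approach). The "910" pattern is only an example of
    # an error code mentioned in this thread; match whatever your error.log
    # actually contains.
    if [ -f error.log ] && grep -q "910" error.log; then
        echo "Upload error (910) logged; buffered results are still pending."
        exit 1
    fi
    echo "No matching upload errors found in error.log."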

  21. #21
    Peaches Moogie's Avatar
    Join Date
    Mar 2002
    Location
    Peachville
    Posts
    2,463
    Blog Entries
    3
    Originally posted by Rbreb13
    Ditto on that. Every single one of my clients is buffering. None will upload!
    Same here.





    irc.aknarra.net #lobby
    irc.free-dc.org #free-dc

  22. #22
    Social Parasite
    Join Date
    Jul 2002
    Location
    Hill Country
    Posts
    94
    Originally posted by Stardragon
    ... In the new way, all 10 results would still not be uploaded to the server at once. What happens instead is that if the particular fileset currently being uploaded takes too long to validate, you will get a receipt and be able to do useful work offline, instead of getting a timed-out connection which results in missing generation errors. This will also allow us to cache uploads properly and balance the load on the database without inconveniences to the user.

    Nothing has changed in the actual upload process from the user's side. The -ut flag still works exactly the same - if validation is unavailable, it will simply quit (similar to a failed upload or a broken connection). The only effective changes visible to users would be the presence of the receipt file in the working directory, and fewer timed-out connections in the error log.
    Elena, your scenario makes two assumptions: (1) that the session that receives a ticket back has other useful work that it can perform, and (2) that when the client is next ready to contact the server, there will exist an on-line connection for it to use.

    I'm on dial-up, meaning setting up the path for (2) takes manual effort. It is MUCH easier for me (I'm using '-ut') to immediately restart the client (for another attempt at the server) than to have to re-dial after waiting a bit.

    And why am I using '-ut'? Because I don't want my CPU idling while each upload takes place. So I have *another* session to FULLY ('-i f') utilize my CPU. The session that receives the ticket has __no__ useful work it can do, except to try to upload again.


  23. #23
    Senior Member
    Join Date
    Mar 2002
    Location
    MI, U.S.
    Posts
    697
    Originally posted by Xelas
    But then can I safely delete the previous directory? If yes, when? I assume that it's safe when filelist.txt is "empty"?
    That's how I understand it, yes.

    If the previous directory is not empty, then there is no point in uploading the next fileset, as it won't be accepted until the previous one was processed.
    Right (again, as I see it).

    Currently I scan error.log. You can search for particular errors if they are added to the error.log file. It's very easy to implement using the "find" command in bat files. Let me know if you need an example.
    Actually I run Linux, so it'd be grep, not find. But a return value means you can:

    Code:
    while /bin/true ; do
        ./foldtrajlite -f protein -n native -ut
    
        status=$?    # grab the return value
    
        if [ $status -eq 0 ] ; then
            # Uploading is finished, nothing else in filelist.txt
            exit 0
        elif [ $status -eq 1 -o $status -eq 2 ] ; then
            # Either it got a receipt, or it's waiting for one to finish; delay
        sleep 600    # 10 minutes
        else
            # some other error condition... this may never happen though
            exit $status
        fi
    done
    I'd be able to name that file upload.sh, and run it. A few hours later (depending on the number of gens in filelist.txt), it will have uploaded everything.
    "If you fail to adjust your notion of fairness to the reality of the Universe, you will probably not be happy."

    -- Originally posted by Paratima

  24. #24
    Boinc'ing away
    Join Date
    Aug 2002
    Location
    London, UK
    Posts
    982
    With this new ticketing system, will the 'buffered gens' count in progress.txt reflect this (as dfGui, dfMon and others use progress.txt to show buffered gens)? It seems to indicate 1 more gen buffered than has been uploaded (I assume due to the 'hang on to the old gens' part)...

  25. #25
    Boinc'ing away
    Join Date
    Aug 2002
    Location
    London, UK
    Posts
    982
    Another thing I can see about this new ticketing system is that it is a bit off-putting/depressing to see a client with x amount buffered and no indication of whether it is a client problem or just awaiting confirmation back from the ticket system. Any feedback on whether a message can be added to the error log saying something along these lines?

  26. #26
    Christmas Lighterer!
    Join Date
    Mar 2004
    Location
    Upstate NY, USA
    Posts
    556
    Blog Entries
    1
    Originally posted by pfb
    Another thing I can see about this new ticketing system is that it is a bit off-putting/depressing to see a client with x amount buffered and no indication of whether it is a client problem or just awaiting confirmation back from the ticket system. Any feedback on whether a message can be added to the error log saying something along these lines?
    Agreed. I'm pulling one or two hairs out because it looks like a lost cause right now... even though it may not be?
    Our Christmas Lights:
    MaineLights.org

  27. #27
    Member
    Join Date
    Apr 2002
    Location
    Denmark
    Posts
    45
    Originally posted by Jeff
    Agreed. I'm pulling one or two hairs out because it looks like a lost cause right now... even though it may not be?
    Fear not. I have just watched my client do the first 20 or so generations, buffering all of them locally and keeping a receipt.txt with a timestamp dating from when it first attempted upload. Just recently that changed: apparently it managed to upload and confirm/validate 4 generations and thus updated receipt.txt - both content and timestamp.
    All of this happened without anything being written to error.log, so an empty error.log combined with buffered generations is not necessarily a bad thing. It just means that the backend server is busy (probably processing other people's work).

    I also got points for those 4 confirmed/validated generations, and all of this with only a few seconds of idle time between generations.

    /Mighty

  28. #28
    Fixer of Broken Things FoBoT's Avatar
    Join Date
    Dec 2001
    Location
    Holden MO
    Posts
    2,137
    listen to ^ mighty ^

    Some of my boxen somewhere are uploading new work, or I wouldn't have any points!


    I think it will be OK once we get used to it.

    However, the idea of having some type of feedback from the server as to how many tickets are ahead of each client (essentially how long the line is) would be sweet.
    Use the right tool for the right job!

  29. #29
    Registered User Morphy375's Avatar
    Join Date
    Jun 2003
    Location
    Regensburg, Germany
    Posts
    81
    On my testbox DF is running smoothly now. Within half a day it crunched 100 gens and uploaded 19 gens. If this upload speed continues I'll have to look for a bigger HD....

    But what HDs do I need if I start my little farm again?

  30. #30
    Boinc'ing away
    Join Date
    Aug 2002
    Location
    London, UK
    Posts
    982
    It would definitely be nice to get some feedback on the tickets / queue system - I've now got 625 gens buffered and no sign of them uploading...

  31. #31
    Originally posted by pfb
    It would definitely be nice to get some feedback on the tickets / queue system - I've now got 625 gens buffered and no sign of them uploading...
    About 20% of my gens of the current protein, on multiple systems, have been uploading; the rest are caching. I presume the old protein is still hogging the upload servers. If not, we are in trouble unless we can get a 500% increase.
    OCAU

  32. #32
    I'm getting 10% of what I produce going up and the rest just building up. If this isn't down to old-client load but rather a ticket/receipt/process problem, we're never going to be able to catch up.

  33. #33
    Senior Member
    Join Date
    Jan 2003
    Location
    North Carolina
    Posts
    184
    Originally posted by erk
    I presume the old protein is still hogging the upload servers.
    I checked the Sneakers stats and the last update was only 16M points for both old and new. That's about 1/4 of what the server was doing while bogged down with the last protein. So this problem is not due to server load or a flood of old results. There is a problem somewhere.

  34. #34
    Boinc'ing away
    Join Date
    Aug 2002
    Location
    London, UK
    Posts
    982
    The 0900 update was ~13.5 million, the 1100 update was just under 18 million, and the 1300 update was just over 20 million - so the rates are going up. I can't tell how many are new and old proteins as that isn't logged...

    This is still low - lower even than when the fast protein was clogging things up... hopefully the next few days will speed up. Not too keen on having so many buffered gens...

  35. #35
    Senior Member
    Join Date
    Jul 2002
    Location
    Kodiak, Alaska
    Posts
    432
    Everyone with a 908 error is now suffering from 910 errors... so there are also tons of uploads from people that aren't adding points. (They need to clean up the bad stuff: start a new directory, or clean up the old one, getting rid of the receipt.txt from the old client.)
    www.thegenomecollective.com
    Borging.. it's not just an addiction. It's...

  36. #36
    Registered User Morphy375's Avatar
    Join Date
    Jun 2003
    Location
    Regensburg, Germany
    Posts
    81
    No new ticket for the last 14 hours....
    1st set complete....
    31 gens uploaded....
    hm.....

  37. #37
    Junior Member
    Join Date
    Apr 2002
    Location
    Brum, UK
    Posts
    6
    Maybe it's just me, but all my machines are buffering like crazy - less than 10% of the gens have been uploaded.

    After reading a fair bit of the posting about the new system and trying a delete of receipt.txt on one rig, I have the following observation:

    All my buffered rigs have a receipt.txt with an IP of either 192.168.10.106 or 192.168.10.110.

    If I delete receipt.txt and try uploading, I am able to upload results UNTIL (you guessed it) I get a result ticketed from either of the two aforementioned IPs.

    Conclusion - those two servers are either overloaded or fubar'd.

    Anyone else able to verify these findings?

    moz

  38. #38
    Registered User Morphy375's Avatar
    Join Date
    Jun 2003
    Location
    Regensburg, Germany
    Posts
    81
    Don't delete the receipt.txt! It may result in a 910 error....

  39. #39
    Originally posted by Morphy375
    Don't delete the receipt.txt! It may result in a 910 error....
    Hmm, I was getting pages of 910 errors without deleting the receipt.txt.
    OCAU

  40. #40
    moz, I've got .110 and getting no uploads.
