
Thread: FDCPS Severe Outrages thread

  1. #1
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948

    FDCPS Severe Outrages thread

    **** CONNECTION DETAILS IN THIS THREAD ARE NOT CURRENT ****
    *** PLEASE REFER TO THE "How to get started thread" > here < ****


    COMPLETED - Planned outage 14-15 nov

    There will be a planned outage of the FDCPS project server at the end of this week, 14-15 NOV.

    The purpose of the outage is to relocate the project to a new server.

    Preparations have been made to make this as simple for the participants as possible.

    If your llrnet client config file is addressing the server as 'primesearch.free-dc.org' then you should not need to change anything.

    In order to minimise any errors we have shortened the knpairs file on IB-7773.
    The pairs that have been removed are ready for your clients on the new server.
    IB-7773 will be run dry (ETA 13 NOV), allowing your clients to empty their cache and the server to capture the last of the results from that port.

    At that point your client will become idle.

    Just leave your client running and when the DNS change is complete, it will resume normally.

    Once the project DNS change is implemented your client will begin picking up new work from the new server.

    The project DNS change may take 24-48 hours depending on the rate of refresh of the DNS cache of your ISP.

    Exceptions:
    1) If the client config is set up to address the server by its IP address, you should change it to the project server name, 'primesearch.free-dc.org', instead.
    (Refer to the "How to get started" thread)
    2) If you are running the client under Unix/Linux and have added the project server to your hosts file you will need to update your hosts file with the new IP address.
    3) If you run a local DNS server / cache you may wish to force a DNS update.
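For exception 2, one quick way to spot a stale entry is to scan the hosts file for lines that pin the project name. A minimal Python sketch, assuming the usual hosts-file layout (IP address first, then one or more names, `#` starting a comment). The function name, the sample text and the sample addresses (taken from the 192.0.2.0/24 documentation range) are made up for illustration:

```python
def hosts_overrides(hosts_text: str, hostname: str) -> list[str]:
    """Return the IP addresses that a hosts-file text pins to `hostname`."""
    pinned = []
    wanted = hostname.lower()
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if wanted in (name.lower() for name in names):
            pinned.append(ip)
    return pinned


# Hypothetical hosts-file content, for illustration only.
SAMPLE_HOSTS = """
127.0.0.1   localhost
# 192.0.2.99 primesearch.free-dc.org  (commented out, so ignored)
192.0.2.10  primesearch.free-dc.org fdcps
"""

print(hosts_overrides(SAMPLE_HOSTS, "primesearch.free-dc.org"))
```

On Unix/Linux you would feed it the contents of /etc/hosts. Any address it returns is one you would need to update by hand (or remove) once the new server is live.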


    Any changes to this release will be posted here.

    Thank you for your patience during this change.

    AMDave
    Last edited by AMDave; 11-22-2011 at 11:58 PM. Reason: Thread merge
    . . . . . ___
    . . . . . . .\___/\______
    . . . . . . . \__AMD___\\__
    -----------------------------------------

  2. #2
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    The transition occurred a bit earlier than planned.
    The DNS cascade was much faster than expected.

    Examining the transfer rates per participant per hour, it looks like all clients carried across without a hitch.

    The stats export from the new server yesterday did not contain some data from the old server leading up to the transition. This resulted in a 'negative' stats update for some participants. However, the new server has now been updated with the data from the old server and the stats export has made up the difference, returning the stats to the normal and correct values.

    Because of the unexpected speed of the DNS cascade, some pairs were left unprocessed on IB-7773. Those pairs have been made available on PCZ-7774 for clean-up.

    Cheers to PCZ, Bok and Beyond for the help in the transition.

    That pretty much concludes the transition.

    Thank you for your patience.


    A very special "Thank You" to Ironbits for the incentive, inspiration, ideas and infrastructure that got FDCPS to where it is.
    He's a Champion!

  3. #3
    Yup, everything seems to switch over pretty well, except that the new server still isn't handing out many primes...

  4. #4
    DinkaTronic Shish's Avatar
    Join Date
    May 2005
    Location
    Gateshead UK
    Posts
    882
    I'm only running an x58 with 8 instances but I keep a small cache so nothing was even noticed. Congrats to the squad for a successful, not too? stressful move. All your work is appreciated, especially IB for getting me into it, even if he is on sabbatical or retired. Can't stay on holiday for ever bud, come back soon cos I actually miss you.....
    My lowly thanks to AMDave, and the usual crew of Bok,Pcz and Beyond for your tireless work on behalf of Free-DC and DC in general. Dunno where you guys find the time and energy cos I'm retired and I can't find any of either
    Like an ol` 8086, slow but serviceable.
    One advantage of old age...nobody can tell you how much cake you can eat


  5. #5
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948

    FDCPS Project is back on-line

    /ed- Thread title updated to reflect new status. See thread posts below -ed/

    The FDCPS Project is currently offline.
    The outage was planned to relocate to an alternate host.
    However the new server is experiencing an unplanned connectivity issue that may take a day or two more to be resolved.
    Further information will be posted when available.
    Last edited by AMDave; 03-31-2011 at 10:54 AM.

  6. #6
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    Update - Progress is being made
    The broadband connection has been restored.
    Middleware upgrades and dynamic testing are in progress.
    I'll post the next update when access is restored.

  7. #7
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    lots happening.
    migrated the site from 32 bit arch to 64 bit arch to take full advantage of 8GB ram
    moved to a different and slightly faster OS distro
    upgraded all middleware to current stable versions and latest patches
    that forced more than a few code changes to the web site and server scripts
    re-factored a lot of code while going through the spaghetti
    retired ChartDirector and introduced ajax flot charts
    most of the bugs are knocked out
    located a port 80 block in the WAN that "is not supposed to exist" and worked around it
    just a couple more issues (2, I think) to work through before we can resume this crunch-fest
    after that, tuning and miscellaneous bug fixes will resume in the background
    back soon ...

  8. #8
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    LAN and MAN testing completed.
    WUs are being completed and processed correctly.
    The server seems pretty stable over the last 6 days.
    WAN testing commencing.

  9. #9
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    DNS changes completed and verified.

    In the words of the great Duke Nukem, "Come get some!"

    FDCPS is open for business.

  10. #10
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    The old address cannot be forwarded to an A-name address and I cannot get a fixed IP address in a short period of time.

    So, for the foreseeable future the FDCPS project will be on http://fdcps.no-ip.org/

    You will need to change the hostname in your llr-clientconfig.txt file to "fdcps.no-ip.org" to get pairs from the ports.
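If you have several clients to update, the edit can be scripted. A rough Python sketch, under the assumption that your llr-clientconfig.txt holds the hostname in a line of the form `server = "..."`; check your own file first, as that layout, the function name and the sample `port` line are assumptions for illustration, not something confirmed in this thread:

```python
import re

# Assumed layout: a line such as   server = "some.host.name"
SERVER_LINE = re.compile(r'^([ \t]*server[ \t]*=[ \t]*")[^"]*(")', re.MULTILINE)


def set_server(config_text: str, new_host: str) -> str:
    """Return config_text with the assumed server line pointing at new_host."""
    return SERVER_LINE.sub(
        lambda m: m.group(1) + new_host + m.group(2), config_text
    )


# Hypothetical config content, for illustration only.
old_config = 'server = "primesearch.free-dc.org"\nport = 7773\n'
print(set_server(old_config, "fdcps.no-ip.org"))
```

The function only transforms text, so you can review the result before writing it back over the real file. Everything other than the server line is left untouched.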

    Please post here if you have any issues.

  11. #11
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    The FDCPS server is running but a planned network change did not go completely to plan yesterday so it is not available yet.
    The old DSL modem couldn't handle the higher rate any more and failed to hold sync.
    The new router is working fine but it won't update the dynamic DNS automatically.
    I should have the work-around completed today.
    But fast?
    The connection upgrade is completed and the connection speed is much higher (d/l 5x, u/l 1.2x) and the available usage ceiling is much higher too (4x).
    So well worth a few troubles of adding a DNS work-around.
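For anyone in the same spot with a router that won't push dynamic DNS updates, the general shape of such a work-around can be sketched in Python. This is not the actual work-around used here (which isn't described in this post), just an illustration: a script run on a schedule compares the current public IP with the last one published and only pushes an update when it changes. `publish` is a hypothetical callback standing in for whatever your DNS provider's update mechanism is, and the addresses are from a documentation range:

```python
from typing import Callable, Optional


def sync_dynamic_dns(
    current_ip: str,
    last_published: Optional[str],
    publish: Callable[[str], None],
) -> str:
    """Publish current_ip if it differs from last_published.

    Returns the address that is now considered published, so the caller
    can store it and pass it back in on the next run.
    """
    if current_ip != last_published:
        publish(current_ip)  # hypothetical provider-specific update
    return current_ip


# Illustration with a fake publisher that just records the calls.
published = []
state = sync_dynamic_dns("198.51.100.7", None, published.append)
state = sync_dynamic_dns("198.51.100.7", state, published.append)  # no change
state = sync_dynamic_dns("198.51.100.8", state, published.append)  # IP moved
print(published)
```

Keeping the comparison separate from the provider call means the script stays quiet while the address is stable, which most dynamic DNS services prefer.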
    Last edited by AMDave; 06-17-2011 at 11:31 PM.

  12. #12
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    There was a local unplanned power outage this morning that lasted for about 5 hours.
    When the server came back up the clock was wrong and it performed the next roll-over before it was scheduled to.
    I fixed the clock and re-tagged the extra set of files so they won't get overwritten when tonight's roll over occurs on schedule.
    Happily the only person inconvenienced was me

  13. #13
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    In around 16 hours from now, there will be a planned FDCPS outrage of about 1 hour to complete a number of software upgrades and hardware maintenance.

  14. #14
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948

    Updated version of the FDCPS stats site (http://fdcps.no-ip.org/stats/index.php)

    Updated version of the FDCPS stats site has been deployed
    • CSS3 changes completed
    • HTML5 changes completed (except for one single anchor I have left for later)
    • Page filtering fixed
    • Page sorting fixed
    • a bucket load of syntax errors have been tossed out in the gutter
    • Some admin security & page changes completed
    • Added the server status page so we can see ports that are current but not active
    • Added additional blocks to the drive progress chart
    • Tweaked the pie charts on the dashboard

    Tested in:
    Safari, Chrome, Chromium, Firefox, Epiphany, Arora, Midori, Seamonkey, Opera, IE9 and elinks

    If you spot any bugs, please call a pest controller. You really shouldn't let them breed in your house

  15. #15
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    unplanned outage happening right now
    maybe 1 hour
    depends on this spectacular lightning storm

  16. #16
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    This extreme weather outage is over.

    That is to say ... This extreme weather outage is now over someone else's house. :P

  17. #17
    It has been a while, okay a long time, but easing back into it.

    Thanks Dave for keeping everything up and running.
    -:Beyond:-


  18. #18
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    My pleasure.
    No really.
    I got 3 primes early on.
    None since then, but that's ok because they were good ones
    Great to see your post.

  19. #19
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    unplanned power outage took the server down for about 5 minutes.
    back up and running. tests ok.

  20. #20
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    There will be a planned outage for several hours tomorrow while electrical work is carried out.

  21. #21
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    Planned outage concluded without incident.
    Updates and patches applied before restart.
    All good.

  22. #22
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    There will be a planned FDCPS outage for further mains electrical work to be carried out.
    Planned start is 12 hours from now.
    Planned end is 17 hours from now.

  23. #23
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    FDCPS planned outage completed on schedule.

  24. #24
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    Full DR plan execution and tests for FDCPS completed successfully at 2012-05-20.
    Because DR procedures are not 'good' unless you keep verifying that they work.

    The FDCPS server PSU fan bearings made some bad noise while at max RPMs during a heat spike about a week ago for a few hours, so it was time to check on a few things.
    I made a few updates to the rebuild document during the DR test, but there is one more item outstanding, so I'll commit that back into the project tomorrow.
    The project server runs well in a KVM cluster so I'm planning to migrate it to a KVM guest back on the same hardware soon-ish to get all the benefits that brings.
    I also verified that the implementation works with Nginx (very snappy btw).
    I'll add that config to the doco for the commit tomorrow.

    It went so well I am still looking for the "Gotcha!"

  25. #25
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    The FDCPS site and ports may be unavailable for a few minutes in 3 hours' time [16:30 AEST / 06:30 UTC] while I migrate it to another machine to allow the server maintenance and OS upgrade.
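If you want to work out the window in your own timezone, the AEST to UTC arithmetic is easy to check. A small Python sketch, assuming AEST is a fixed UTC+10 offset with no daylight saving; the calendar date used is arbitrary and only there so the datetime can be built:

```python
from datetime import datetime, timedelta, timezone

# Assumption: AEST treated as a fixed UTC+10 offset.
AEST = timezone(timedelta(hours=10), "AEST")


def aest_to_utc(year: int, month: int, day: int, hour: int, minute: int) -> datetime:
    """Convert an AEST wall-clock time to an aware UTC datetime."""
    local = datetime(year, month, day, hour, minute, tzinfo=AEST)
    return local.astimezone(timezone.utc)


# The date below is arbitrary; only the time of day matters here.
start_utc = aest_to_utc(2012, 6, 2, 16, 30)
print(start_utc.strftime("%H:%M UTC"))  # prints "06:30 UTC"
```

From the UTC value, `astimezone()` with your own offset gives the local start time.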

  26. #26
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    Quote: Originally posted by AMDave
    ...It went so well I am still looking for the "Gotcha!"
    FOUND IT!

    That went badly.
    The network bridge to the VM version of the server failed.
    The version-to-version upgrade of the OS on the server box failed because the server network configuration broke between 10.10 and 11.10. The box lost all network connectivity and failed to boot, meaning that the next OS version upgrade was a dead-stick.

    I am now doing a clean install on the server box: ubuntu-12.04-alternate-amd64

    Having tested the DR plans, I have a tried and tested recovery path, so it's just a matter of time.
    Once the OS is running I'll have to rebuild the databases, web and mail servers etc.
    Not confident of getting it all done tonight, but it's on the way.

    In spite of copious testing, sometimes things that can go wrong still do go wrong.

    Damn you to the infernal depths of hell, Murphy!
    Last edited by AMDave; 06-03-2012 at 06:37 AM.

  27. #27
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    back on line.
    time for some r&r

  28. #28
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    DR process re-tested and updated.
    More steps added. Added appendices for choice of Apache / Nginx config
    Site now running under Nginx.
    Still running on bare metal.
    LXC / KVM version planned.

  29. #29
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948

    FDCPS server - planned outage 1 hour

    FDCPS server - planned outage 1 hour
    FROM 09:00 AEST
    TO 10:00 AEST
    For OS upgrade and security updates.

  30. #30
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    Planned outage completed successfully.

  31. #31
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    FDCPS outage:
    ISP maintenance activity is scheduled affecting local, national and international access.
    It appears to have started several hours early so the window is now much longer.
    Expect problems accessing the server for the next 9-10 hours.

  32. #32
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    unplanned outage happening right now
    maybe 4 hours
    large lightning storms - http://info.energex.com.au/lightning...xtern_7765.gif

  33. #33
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    FDCPS is back online.
    Threw in a new case fan as one of them was an ex-parrot.

  34. #34
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    5 hour unplanned outage due to electrical storm activity
    server is now back to normal operations

  35. #35
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    The FDCPS server will be intermittently unavailable for up to the next 3 hours during planned maintenance.

  36. #36
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    planned outage is complete.
    it took a little bit longer to tidy up some config items.
    The server is a little bit quicker at everything now.

  37. #37
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    FDCPS server is down due to an unplanned lightning strike which has upset more than just a few electrons.
    Tomorrow I'll migrate the services to another machine, then see if I can recover the hardware.
    ITMT - apologies for the outage.

  38. #38
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    FDCPS server restored.
    Sorry about the down time.

  39. #39
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    Unplanned power outage in progress.
    Large area is in blackout. No ETF.

  40. #40
    Administrator AMDave's Avatar
    Join Date
    Sep 2004
    Location
    deep in a while-loop
    Posts
    1,948
    Unplanned outage of FDCPS project.
    The server is not out.
    The site is not out.
    The comms are not out.
    Microsoft has hijacked No-IP's 23 free domains & DNS, with the permission of a US judge, to 'filter' out a few bad hosts and, in (predictably) failing to do so appropriately, has created a denial of service for pretty much everyone with a No-IP dynamic DNS.
    ETF - unknown. Somewhere around the same time as someone important gets the judge out of bed to revoke the order probably.
    Hosting a site from the southern hemisphere on an OS that has nothing to do with M$ does not keep you out of the M$ blast zone.
    I object that a US judge deems that they can sign off on M$ hijacking an international highway to identify and stop the defective 'vehicles' that M$ built and sold, instead of demanding they issue a recall.
    This is no April Fools' Day.

    No-IP - https://www.noip.com/blog/2014/06/30...soft-takedown/
    SlashDot - http://yro.slashdot.org/story/14/07/...tm_medium=feed
    ArsTechnica - http://arstechnica.com/security/2014...no-ip-domains/

    The offender - http://blogs.technet.com/b/microsoft...isruption.aspx

    The fail:
    "In the meantime, NO-IP / Vitalwerks have published their answer online:
    “Apparently, the Microsoft infrastructure is not able to handle the billions of queries from our customers. Millions of innocent users are experiencing outages to their services because of Microsoft’s attempt to remediate hostnames associated with a few bad actors.”"

    Hopefully normal service will resume shortly.
    Last edited by AMDave; 07-01-2014 at 09:51 AM.
