Thread: FDCPS Severe Outrages thread

  1. #1
    AMDave (Administrator)
    Join Date: Sep 2004
    Location: deep in a while-loop
    Posts: 1,948
    Full DR plan execution and tests for FDCPS completed successfully on 2012-05-20.
    DR procedures are not 'good' unless you keep verifying that they work.

    The FDCPS server PSU fan bearings were noisy for a few hours at max RPM during a heat spike about a week ago, so it was time to check on a few things.
    I made a few updates to the rebuild document during the DR test, but there is one more item outstanding, so I'll commit that back into the project tomorrow.
    The project server runs well in a KVM cluster so I'm planning to migrate it to a KVM guest back on the same hardware soon-ish to get all the benefits that brings.
    I also verified that the implementation works with Nginx (very snappy, btw).
    I'll add that config to the doco for the commit tomorrow.

    It went so well I am still looking for the "Gotcha!"
    . . . . . ___
    . . . . . . .\___/\______
    . . . . . . . \__AMD___\\__
    -----------------------------------------

  2. #2
    AMDave (Administrator)
    The FDCPS site and ports may be unavailable for a few minutes in 3 hours' time [16:30 AEST / 06:30 UTC] while I migrate it to another machine to allow the server maintenance and OS upgrade.
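
    For anyone converting the outage window to their own timezone, here is a minimal sketch of the AEST-to-UTC arithmetic. It assumes AEST is a fixed UTC+10 offset; the helper name aest_to_utc and the calendar date are placeholders for illustration, only the time of day comes from the announcement.

```python
from datetime import datetime, timedelta, timezone

# Assumption: AEST is treated as a fixed UTC+10 offset (no daylight saving).
AEST = timezone(timedelta(hours=10), "AEST")


def aest_to_utc(local: datetime) -> datetime:
    """Attach the AEST offset to a naive datetime and convert it to UTC."""
    return local.replace(tzinfo=AEST).astimezone(timezone.utc)


# The date below is a placeholder; only the 16:30 time is from the post.
start = aest_to_utc(datetime(2012, 6, 3, 16, 30))
print(start.strftime("%H:%M %Z"))  # prints "06:30 UTC"
```

    Swapping timezone.utc for another fixed offset gives the local time for other regions.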

  3. #3
    AMDave (Administrator)
    Originally posted by AMDave:
        ...It went so well I am still looking for the "Gotcha!"

    FOUND IT!

    That went badly.
    The network bridge to the VM version of the server failed.
    The version-to-version OS upgrade on the server box failed because the server network configuration broke between 10.10 and 11.10. The box lost all network connectivity and failed to boot, meaning that the next OS version upgrade was a dead-stick.

    I am now doing a clean install on the server box: ubuntu-12.04-alternate-amd64

    Having tested the DR plans, I have a tried and tested recovery path, so it's just a matter of time.
    Once the OS is running I'll have to rebuild the databases, web and mail servers etc.
    Not confident of getting it all done tonight, but it's on the way.

    In spite of copious testing, sometimes things that can go wrong still do go wrong.

    Damn you to the infernal depths of hell, Murphy!
    Last edited by AMDave; 06-03-2012 at 05:37 AM.

  4. #4
    AMDave (Administrator)
    Back online.
    Time for some R&R.

  5. #5
    AMDave (Administrator)
    DR process re-tested and updated.
    More steps added, plus appendices covering the choice of Apache or Nginx config.
    Site now running under Nginx.
    Still running on bare metal.
    LXC / KVM version planned.

  6. #6
    AMDave (Administrator)
    FDCPS server - planned outage 1 hour
    FROM 09:00 AEST
    TO 10:00 AEST
    For OS upgrade and security updates.

  7. #7
    AMDave (Administrator)
    Planned outage completed successfully.
