Fail-Over and Disaster Recovery
Mirror copies of all client data are maintained on specialised mirror servers, which are updated every hour. Each application server has both an on-site mirror (same data centre, different server) and an off-site mirror (different data centre, different server). All client data is encrypted while in transit.
We have a comprehensive fail-over process in place for the loss of a server or data centre, whether as a result of hardware failure, power failure or communication failure. We make extensive use of pre-configured virtual servers, which allows us to provision a new PPO application server within minutes in any one of our data centres.
In conjunction with the mirror process previously described, this allows us to move any or all clients to an alternate server, data centre or hosting country within a very short period of time. This process is continuously tested as part of disaster recovery preparedness, and is also used routinely when upgrading our hardware or redistributing load.
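As an illustration only, the choice between the on-site and off-site mirror during a fail-over could be sketched as follows. This is a minimal sketch, not PPO's actual tooling: the `Mirror` type, the `choose_failover_mirror` function, the data centre names and the rule of preferring the on-site mirror whenever its data centre is still available are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mirror:
    """Hypothetical description of one mirror server."""
    name: str
    data_centre: str
    healthy: bool


def choose_failover_mirror(
    primary_dc: str, dc_down: bool, mirrors: List[Mirror]
) -> Optional[Mirror]:
    """Pick the mirror to fail over to, or None if no mirror is usable.

    Assumed policy: skip unhealthy mirrors, skip mirrors in the primary
    data centre when that whole data centre is down, and otherwise
    prefer the on-site mirror over the off-site one.
    """
    usable = [
        m for m in mirrors
        if m.healthy and not (dc_down and m.data_centre == primary_dc)
    ]
    # False sorts before True, so mirrors in the primary data centre come first.
    usable.sort(key=lambda m: m.data_centre != primary_dc)
    return usable[0] if usable else None
```

Under these assumptions, the loss of a single server leads to the on-site mirror, while the loss of the whole data centre leads to the off-site mirror.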
How is this process communicated to, and managed with, the customer during such an incident?
The PPO application is continuously monitored from an off-site location by a specialised service provider. If the PPO application does not respond within 3 minutes, automatic SMS messages and e-mails are sent to multiple support staff, who then start a response plan based on a set escalation procedure.
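The alerting rule described above can be sketched as a small state tracker. This is a hedged illustration, not the monitoring provider's implementation: the `DowntimeTracker` class and its method are hypothetical names, and the sketch assumes that "does not respond within 3 minutes" means the application has failed every check for 3 consecutive minutes.

```python
from dataclasses import dataclass
from typing import Optional

ALERT_AFTER_SECONDS = 3 * 60  # the 3-minute threshold stated above


@dataclass
class DowntimeTracker:
    """Hypothetical tracker of how long the application has been unresponsive."""
    down_since: Optional[float] = None  # time of the first failed check in this outage
    alerted: bool = False               # whether staff were already notified for this outage

    def record_check(self, ok: bool, now: float) -> bool:
        """Record one check result; return True when alerts should be sent."""
        if ok:
            # The application responded, so any outage in progress is over.
            self.down_since = None
            self.alerted = False
            return False
        if self.down_since is None:
            self.down_since = now
        if not self.alerted and now - self.down_since >= ALERT_AFTER_SECONDS:
            # Threshold reached: notify once per outage (assumed behaviour).
            self.alerted = True
            return True
        return False
```

A caller would feed each check result and timestamp into `record_check` and send the SMS and e-mail notifications whenever it returns `True`. Alerting once per outage is a design choice of this sketch; the escalation procedure itself is not modelled.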
If the application is unavailable for a period of time, users are kept informed of the situation and of progress towards resolution via Twitter, and e-mail notifications are sent to key client contacts if the problem persists.
In severe cases, or upon request, the PPO Support Team will provide an incident report detailing the cause of the failure, the measures taken to resolve the issue and the steps implemented to prevent a similar failure in future.
However, with the fail-over and mirroring processes described above, we take great care not to lose any data during a hardware, power or communication failure, and the greatest impact on users is that PPO may be unavailable for a limited period of time.