Backup and Disaster Recovery
Have a Plan for When Things Go Wrong (because they will)
What's your plan if your software, system, server, database or connectivity suddenly becomes unavailable?
This question came up recently while we were onboarding a new customer at Cloud365.
Fortunately for them, we were already in the process of migrating their application from another cloud provider and had a recent copy of their databases on our platform. However, that copy was not considered production ready because the data had been restored from a backup taken several days earlier.
On Friday afternoon we took a backup of everything from their previous hosting provider, restoring it in our staging/test environment shortly before close of business so it could be tested over the weekend.
Over the same weekend their previous hosting provider experienced a service interruption, taking their application and databases offline. In fact, they were still offline Monday morning.
Luckily, the customer had an offsite backup from overnight Friday, which we were able to obtain and restore into our staging environment, where the new customer was already testing.
This resulted in an "accelerated" go-live, switching production services from their old hosting provider to Cloud365 on Monday at around lunchtime. Whilst the process was a little quicker than planned, it enabled our new customer to get their service back online while the previous hosting provider was still offline.
To their credit, the customer did have a business continuity plan and did execute the part of that plan covering data and services. However, the incident raised questions, which we were happily able to answer.
The first question was about redundancy for their application and data. We had the application redundancy taken care of via real-time replication of their application front end and back end between our Sydney and Melbourne datacentres. This effectively allows customers to run multiple production environments, or a mix of production, staging and test environments, removing dependency on a single location or region. Tick.
Then this question from the customer's IT Manager:
What happens if our database is offline and we can't access backups?
This is where our partnership with HP benefits our customers, through a purpose-built platform we call our 'Backup Robots', located at each datacentre facility. Running on new-generation Hewlett Packard Enterprise Class Server infrastructure, it is essentially a site-based, localised backup platform that automatically executes backups as often as required and replicates data sets to other datacentre locations or to customer premises.
Most importantly, customers have direct access to their backups.
We utilise this platform to ensure customer data is backed up, reporting directly to our customers so they too have peace of mind that backups have completed successfully and data sets have been transferred to their target location.
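One simple way to confirm that a data set has reached its target location intact is to compare checksums of the source file and the transferred copy. The Python sketch below is only an illustration of that idea: the function names are hypothetical, it assumes both files are reachable from the same machine, and it is not a description of how the Backup Robot platform performs its checks.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_transferred_intact(source: Path, replica: Path) -> bool:
    """True only if the replica exists and its contents match the source byte for byte."""
    return replica.is_file() and sha256_of(source) == sha256_of(replica)
```

A matching checksum shows that the copy is complete. It does not show that the backup can be restored, which is why a test restore into a staging environment is still worth doing.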
Some customers use this service to replicate production databases to staging and test environments.
The Backup Robot service is included with our service, free of charge.
Needless to say, the customer was pretty happy with the outcome. Their business was back online, and the same issue could not occur again, since their application and databases were now replicated between datacentres in real time, with multiple point-in-time database backups taken throughout each day, including weekends.
We were also able to obtain another incremental backup containing the customer and transaction data missing from the weekend, which was later applied to the customer's database.
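For readers who want to see how point-in-time and incremental backups fit together, here is a minimal Python sketch of choosing what to restore for a given target time: the most recent full backup taken at or before that time, followed by every later incremental up to it. The Backup record and the restore_chain function are hypothetical names used purely for illustration. Real database engines have their own backup catalogues and restore tooling.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Backup:
    taken_at: datetime
    kind: str  # "full" or "incremental" (illustrative labels)


def restore_chain(backups: List[Backup], target: datetime) -> List[Backup]:
    """Return the backups to restore, oldest first, to reach the target time."""
    # Ignore anything taken after the point in time we want to recover to.
    usable = sorted(
        (b for b in backups if b.taken_at <= target),
        key=lambda b: b.taken_at,
    )
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup available at or before the target time")
    base = fulls[-1]  # the most recent full backup is the starting point
    incrementals = [
        b for b in usable
        if b.kind == "incremental" and b.taken_at > base.taken_at
    ]
    return [base] + incrementals
```

The more often backups are taken, the closer the last entry in that chain sits to the moment of failure, and the less data has to be recovered some other way.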
The whole experience certainly raised awareness of application and data integrity, which we take seriously and have the tools and systems in place to support for our customers.
Does your hosting provider back up your applications and databases as part of your service?
Do you have a plan B if your site is offline, such as real-time failover or disaster recovery?
If these questions are a concern, or you need help with an application or website, please reach out and we'll talk through some ideas that may suit your situation.
Thanks for reading. May your backups be in place and restorable.
Darren Moss is a senior platform architect with more than 20 years' experience designing, building and managing enterprise application infrastructure for banking, telecommunications, broadcast and cloud service providers. Darren is the General Manager of the Cloud365 Asia Pacific region, heading up a team of infrastructure support experts in Melbourne, Sydney and Singapore.