I spend most of my professional life consulting as a MySQL Architect for Internet, Mobility and SaaS companies across Southeast Asia, the Middle East and the US. I am accountable for the MySQL Infrastructure Operations, Site Reliability, Performance, Scalability and High Availability of my customers’ MySQL infrastructure. My customers come from diverse industries such as Mobile Advertisement Networks, Online Commerce, Mobile Social Media Gaming and SaaS. Data is the business for them, and a database infrastructure outage is the worst thing that could ever happen, so “Maximum MySQL Availability” is something I am very serious about!
Being a MySQL consultant, I get to meet new customers almost every week, and most of them ask me a very common question: “How can we be Optimal, Scalable and Highly Available?” I ask them back: “What is the maximum duration (in hours) of MySQL outage you could ever afford?” Many of them reply, “We have to be available 24*7*365.” Great to hear, but in real life it’s IMPOSSIBLE! Being an independent MySQL consultant, I am very serious about my professional reputation, and I am always comfortable sharing the good, the bad and the ugly sides of MySQL infrastructure operations. Some customers appreciate my being upfront and some consider me pessimistic, but eventually I sign a common MySQL Infrastructure Operations Management SLA with all of them, covering Possible MySQL Outage Scenarios, Time-To-Recover, Recovery & Healing, HealthCheck/Diagnostics/Forensics, Performance, Scalability and Emergency Response / On-Call. In this post I am addressing Cost-of-Data-Outage.
A data outage is expensive for every corporation from all directions: revenue, brand reputation and eventually customer experience. Think about a travel booking site going offline during the holiday season, or an Ad-Network going down during a shopping festival. The damage is not limited to the web/mobile property; it extends to the strategic partners involved, like the hotels signed up with the travel site or the E-commerce platform partnered with the Ad-Network. So everyone considers “Data Outage” serious, but can you ever promise anyone 100% availability of your data infrastructure? The honest answer is NO! This doesn’t mean we ignore “Data Outage”, but to architect and implement a “Maximum Availability Architecture” you need to know the Cost-of-Data-Outage. Building Self Healing / Auto Failover, Fault Tolerant, Highly Available and Responsive MySQL Infrastructure Operations is expensive ($$$), resource intensive and operationally complex, so plan carefully for Budget, People, Roadmap and Housekeeping/Operations when finalising your MySQL Maximum Availability strategy.
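To make the “how many hours of outage can you afford?” question concrete, here is a minimal Python sketch that converts an availability target into a yearly downtime budget. The function name and the list of targets are my own illustration, not part of any SLA template, and the arithmetic assumes a 365-day year.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def allowed_downtime_hours(availability_pct: float) -> float:
    """Return the yearly downtime budget, in hours, for an availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Illustrative targets only; pick the ones your business actually discusses.
    for target in (99.0, 99.9, 99.99, 99.999):
        hours = allowed_downtime_hours(target)
        print(f"{target}% available -> {hours:.2f} hours/year ({hours * 60:.1f} minutes)")
```

Even a 99.9% target leaves under nine hours of outage per year, planned maintenance included, which is why the answer “24*7*365” deserves a second conversation.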
What are the possible Data Outage scenarios in a MySQL Infrastructure?
The following are a few of the most common Data Outage scenarios in a MySQL Infrastructure:
- Human error: The most common and also the most dangerous, because there is no limit to the damage an individual or group can do to your data, intentionally or not.
- Business: The potential of internet marketing and consumer psychology is often misjudged or underestimated. Poor capacity planning and sizing can bring down your MySQL infrastructure in no time!
- Natural Catastrophe: This could be anything: flood, earthquake, tsunami, fire etc. Unfortunately, man is still often defeated by mother nature!
- Planned Outage: Advances in hardware are so compelling (PCIe SSDs are much faster than HDDs) that upgrades are unavoidable.
What do I consider while planning for MySQL Maximum Availability?
I never standardise MySQL Infrastructure Operations (Performance, Scalability, High Availability and 24*7 MySQL DBA Services) across my customers! To me they are completely different from each other (though all of them use either MySQL GA / Percona Server / MariaDB / WebScaleSQL). The following are a few things I want answers on before personally engaging with a customer’s MySQL Infrastructure Operations:
- MySQL flavour (MySQL GA / Percona Server / MariaDB / WebScaleSQL)
- The current MySQL DBA Operations, SLA, Emergency Support and On-Call DBA
- The current MySQL data growth rate / day
- MySQL Infrastructure Operations Trending Report (generally two weeks of historical data is sufficient in most cases)
- MySQL Health Check, Diagnostics & Forensics Report
- Immediate goals & accomplishments for MySQL Infrastructure Operations
- MySQL Infrastructure Operations Roadmap
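The daily data growth rate from the list above feeds directly into capacity planning. As a rough sketch, assuming growth stays linear (which it rarely does around marketing campaigns), the following Python estimates how many days remain before a data volume fills up. All names and figures are hypothetical.

```python
import math


def days_until_full(capacity_gb: float, used_gb: float, growth_gb_per_day: float) -> float:
    """Estimate days of headroom left on a volume, assuming linear growth."""
    if growth_gb_per_day <= 0:
        # No growth (or shrinking data): the volume never fills under this model.
        return math.inf
    free_gb = max(capacity_gb - used_gb, 0.0)
    return free_gb / growth_gb_per_day


if __name__ == "__main__":
    # Hypothetical numbers: 1 TB volume, 400 GB used, growing 5 GB per day.
    print(f"Headroom: {days_until_full(1000, 400, 5):.0f} days")
```

A number like this is only a starting point for the roadmap discussion; the trending report tells you whether the linear assumption holds.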
What will a typical MySQL Maximum Availability Project Plan look like?
*** This is only to show my readers what a typical Maximum MySQL Availability Project Plan looks like!
- 24*7 Service Operations (Monitoring & Trending MySQL), MySQL Infrastructure Operations Workflow (Emergency Support & On-Call) and MySQL Operations SLA (escalation and maximum accepted time to close tickets successfully)
- Planned MySQL HealthCheck, Diagnostics and Forensics Audit to stay proactive!
- Planned MySQL Tuning Schedule
- MySQL Backup Strategy, Execution and Sanity Testing of MySQL Backups
- MySQL High Availability Architecture, Implementation and Operations
- Planned MySQL Outage (hardware upgrades)
- MySQL Upgrade Plan
Now, what really is the Cost-of-Data-Outage?
The answer to this question is not limited to a number; it goes much beyond that. It is about Customer Experience, Employee Satisfaction (people hate working for websites that frequently go down), Investor Faith, Partner Loyalty, Brand Value and, last but not least, REVENUE!
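Of these, only the revenue part is easy to put a number on. The Python sketch below is a back-of-the-envelope estimate under my own assumptions: an average hourly revenue figure and a hypothetical peak multiplier for periods like a holiday season or shopping festival. It deliberately says nothing about brand value, partner loyalty or customer experience, which are real costs but not captured here.

```python
def outage_revenue_loss(hourly_revenue: float,
                        outage_hours: float,
                        peak_multiplier: float = 1.0) -> float:
    """Estimate direct revenue lost during an outage.

    peak_multiplier is a hypothetical factor for how much busier than
    average the business is when the outage hits (1.0 = an average hour).
    """
    if hourly_revenue < 0 or outage_hours < 0 or peak_multiplier < 0:
        raise ValueError("inputs must be non-negative")
    return hourly_revenue * outage_hours * peak_multiplier


if __name__ == "__main__":
    # Hypothetical figures: 10,000 per hour on average, a 2 hour outage,
    # during a festival that triples normal traffic.
    loss = outage_revenue_loss(10_000, 2, peak_multiplier=3.0)
    print(f"Estimated direct revenue loss: {loss:,.0f}")
```

Comparing this figure with the budget for a highly available setup is a useful first sanity check, as long as everyone remembers it is the floor of the real cost, not the ceiling.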
What should I do to achieve Maximum MySQL Availability?
- Hire / contract a professional MySQL DBA or a MySQL consulting company
- Invest in training / research for your in-house MySQL DBAs. MySQL is evolving fast, so staying updated is key to gaining maximum benefit
- Sponsor your MySQL DBAs to attend conferences. They enjoy learning and networking with peers and industry experts