RPO: I Get No Respect

When considering any backup solution, it is important to assess two key metrics: RPO and RTO.  I put more value on RPO than RTO, and here is why.

RPO (Recovery Point Objective) is how much data, measured in time, you are willing to lose. In practice it comes down to how frequently backups are taken.

RTO (Recovery Time Objective) is the duration of time, and the service level, within which a business must be restored after a disaster or disruption.

RPO needs to be taken EXTREMELY seriously.  You must know where updates to your database are coming from.  Do you have control over reproducing those updates in the event of an outage or disruption to your business systems?

One of my recent HA projects was for an online catalog company. Orders came in from three different sources: website, call center and mail order.  Their total orders averaged 800 per hour.

In the event of an outage, there is simply no way for them to reproduce their web orders.  These web orders represented one third of their orders, and they had an RPO of 24 hours (meaning backups were being done every 24 hours). At 800 orders per hour, a 24 hour window holds about 19,200 orders, so approximately 6,400 of them are web orders. With an average order size of $100.00, their loss on web orders alone could approach $640,000.00 for this 24 hour period.

Their phone orders are entered while a customer service rep is on the phone with the customer.  If an outage were to occur, they cannot re-enter those phone orders, since they are entered live into the system and there is no way to recreate them. That adds another 6,400 lost orders at $100.00 per order, or another $640,000.00, which brings the total loss to $1,280,000.00 for a 24 hour RPO.

But they can replay their mail orders.  So in 24 hours, out of a total of 19,200 orders, they could only recreate 6,400 of them.  Prior to our HA project, their RPO was 24 hours. They now have HA.
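
The arithmetic in the catalog example can be sketched in a few lines of Python. This is only an illustration, not a sizing tool: the names `OrderSource` and `rpo_exposure` are made up for the sketch, and the even one-third split across the three sources is an assumption taken from the 6,400 orders per source figure above. It estimates the worst case, where the outage hits just before the next backup.

```python
from dataclasses import dataclass


@dataclass
class OrderSource:
    name: str
    share: float      # fraction of all orders from this source (assumed even split)
    replayable: bool  # can these orders be re-entered after an outage?


def rpo_exposure(orders_per_hour, avg_order_value, rpo_hours, sources):
    """Worst-case (lost_orders, lost_revenue) for one full RPO window."""
    total_orders = orders_per_hour * rpo_hours
    # Only orders that cannot be replayed count as lost.
    lost_orders = sum(
        round(total_orders * s.share) for s in sources if not s.replayable
    )
    return lost_orders, lost_orders * avg_order_value


# Figures from the catalog company example: 800 orders/hour, $100 average order.
SOURCES = [
    OrderSource("website", 1 / 3, replayable=False),
    OrderSource("call center", 1 / 3, replayable=False),
    OrderSource("mail order", 1 / 3, replayable=True),
]

lost_orders, lost_revenue = rpo_exposure(800, 100.0, 24, SOURCES)
print(f"{lost_orders:,} orders, ${lost_revenue:,.2f}")
# prints: 12,800 orders, $1,280,000.00
```

Plugging in a smaller `rpo_hours` value shows how quickly the exposure shrinks as backups or replication become more frequent.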

Another recent project was for a municipality whose 13,000 utility customers have internet access to pay bills, get copies of printed bills, sign up for paperless billing, view past history, etc.

“During downtime, any transactions would need to be recorded by hand.” “Utility bills are a major source of revenue and could not be reproduced, which would affect cash flow.”  There is no way they can reproduce these transactions.  They now have HA.

Here are some examples of data transactions to help you understand how much control you have over your RPO:

  1. Web transactions
  2. EDI transactions
  3. Forklifts with scanners updating inventory locations
  4. Phone orders
  5. SQL connections from other servers
  6. FTP
  7. ODBC connections (hint for IBM Power i: look at QUSRWRK)

 

Employees are a company’s most valuable asset, second only to your DATA. Period!  You insure your people and your physical assets, so why not insure your valuable data assets as well? Your business data and records could be gone in an instant due to user error, hardware failure, theft, or some other threat.

BCI or BII (Business Continuance Insurance or Business Interruption Insurance)

First and foremost, contact your business insurance company to understand their BCI/BII policy.  Insurance companies know that your company’s data is your most valuable asset.  Discuss your intentions with them about implementing a High Availability solution.  Since you are reducing their risk, they may well reduce your premiums.

These savings could give you the budget to fund an HA project.  A president of a company told me:  “If my building is gone, so is my business.”  Not true.  Buildings and data centers can be rebuilt; your data is priceless.  It is not only your data, it is your cash and your receivables. With HA, that valuable asset will be safe and offsite, giving you access to the receivables (cash) you need to rebuild your business.

 

As for RTO: how much time can be allowed to recover your system in the event of an outage?

I like to ask my clients, “If I pull the power cord on your IBM Power System i, how long will it take for you to put a gun to my head to plug it back in?”  And there is your RTO.

I have been involved in disasters at banks, hospitals and 911 centers. It is amazing how resourceful we can be in a disaster.  Yes, it is stressful getting everything running again.  But the first question is always, “Is our data intact and current?” … then you see the sigh of relief.