High Availability Strategy for Oracle EPM Version 11

As EPM usage expands to global user bases, implementing Oracle EPM in a highly available fashion is becoming a major part of infrastructure planning. Let’s go through a few key considerations when evaluating High Availability for your organization.

Are you sure?

The first question I ask clients is whether they really need it. I find that sometimes IT organizations start implementing load balancing and H/A as part of a “corporate standard” or some made-up Sarbanes-Oxley requirement. In reality, unplanned downtime after a catastrophic hardware failure, although obviously not welcome, can be tolerated in many organizations.

The best bet is to work with Finance to truly understand their availability requirements. Then, take a look at your backup/recovery plan and your hardware vendor agreements for on-site emergency part replacement. Some clients have a 2-hour guaranteed field replacement from their hardware vendor, and even full replacement parts on-site. Think of all the scenarios of unplanned downtime and apply a probability to each to accurately assess the total risk of a non-redundant implementation.

If that does not give you pause – think cost. A highly available installation will double your hardware cost and almost triple your implementation time. There could also be additional licensing costs associated with redundant components.
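To make the risk assessment above concrete, here is a minimal Python sketch that weights each unplanned-downtime scenario by its probability. The scenario names, probabilities, and recovery times are made-up placeholders, not figures from any real environment; substitute numbers from your own hardware history and vendor agreements.

```python
from dataclasses import dataclass


@dataclass
class FailureScenario:
    name: str
    annual_probability: float  # assumed chance of occurring in a given year (0.0 to 1.0)
    hours_to_recover: float    # assumed time to restore service after the failure


def expected_downtime_hours(scenarios):
    """Probability-weighted unplanned downtime per year across all scenarios."""
    return sum(s.annual_probability * s.hours_to_recover for s in scenarios)


# Hypothetical scenarios, for illustration only.
scenarios = [
    FailureScenario("disk failure", 0.10, 4.0),
    FailureScenario("motherboard failure", 0.05, 8.0),
    FailureScenario("power supply failure", 0.05, 2.0),
]

print(f"Expected unplanned downtime: {expected_downtime_hours(scenarios):.1f} hours/year")
```

Comparing that figure with the downtime Finance says it can tolerate gives a rough basis for deciding whether a redundant installation is worth its cost.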

Don’t try this at home

If, after all of that, you decide that H/A is mandatory in your organization, go for it! But beware – do not try this at home. A highly available installation should only be performed by an Oracle Certified infrastructure partner.

High Availability in EPM 11

The biggest change in Oracle’s stance on high availability in version 11 is the dropped support for all 3rd-party clustering solutions such as Veritas Cluster Server and Microsoft Cluster Server, which is quite unfortunate. The only supported clustering methodology is Oracle Clusterware. This is really bad news for those IT shops that already have an older system deployed on a different clustering technology. It is especially frustrating because Oracle Clusterware requires the use of Oracle Cluster File System (OCFS) for shared disk resources, which on Windows can take 5 minutes to fail over. Nice.

Terminology

First, let’s talk terminology, or at least how I will define these terms for this post.

High Availability
The ability to continue to provide computing resources in the event of a fatal hardware failure.

Cluster

Two or more linked machines that are used for load balancing and/or failover of services for high availability.

Load Balancing

Distributing requests among multiple application servers to spread load evenly. Often a load balancer is used as the single entry point, and it distributes requests based on the load of the individual servers or in a simple round-robin fashion. The nice thing about load balancing is that, most of the time, you also get High Availability.

Failover
The ability to automatically switch services to a standby server if the primary server fails.
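
The load balancing and failover definitions above can be illustrated with a small sketch. This is a toy round-robin balancer in Python, not how any particular load balancer product works; the class name and server names are hypothetical. It shows why load balancing usually brings high availability with it: a server marked as down is simply skipped.

```python
class RoundRobinBalancer:
    """Toy round-robin balancer that skips servers marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0  # index of the next server to try

    def mark_down(self, server):
        # Record a failed server so it no longer receives requests.
        self.healthy.discard(server)

    def pick(self):
        # Try each server at most once, starting where the last pick left off.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


lb = RoundRobinBalancer(["web1", "web2", "web3"])
first_round = [lb.pick() for _ in range(3)]    # every server takes a turn
lb.mark_down("web2")                           # simulate a hardware failure
after_failure = [lb.pick() for _ in range(3)]  # web2 is skipped from now on
```

A single-instance service has no second server to skip to, which is why it needs failover to a standby instead.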

The EPM Strategy for High Availability

Each component of the EPM enterprise has a different approach to H/A. Understanding how each component works determines the best methodology.  We implement H/A using the following as a guide.

1.)  Load Balance when you can

Load balancing gives you the best of both worlds – distributing load for performance, and hardware fault tolerance.

This is naturally suited to Web components. Many of the BI Web components are, in essence, standalone web sites packaged and contained in a Java application server (WebLogic, WebSphere, etc.). These sites simply respond to requests and can be load balanced.

  • Workspace
  • Web Analysis
  • Financial Reporting Web Components
  • Planning
  • Analytic Provider Services

2.)    Use product built-in clustering when you can

Some components have a built-in concept of clusters, and some have built-in round-robin load balancing. It’s best to take advantage of that when you can and let the product handle the high availability.

  • Built-in clusters (must still supply a load balancer)
    • Financial Management
    • Financial Data Quality Management
  • Built-in load balancing
    • Interactive Reporting
    • Production Reporting
    • PDF Print Server

3.)    Use 3rd-party failover only when you have to

Some of the other components of the EPM system do not load balance or cluster well.  I call these “The Highlanders” because there can be (or should be) only one.

There can be only one Highlander

  • Foundation
  • Financial Reporting Scheduler Server
  • Essbase Administration Services
  • Essbase Integration Services
  • EPMA Dimension Server
  • Essbase Studio
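
The three-step guide above can be summarized as a lookup table. The sketch below only restates the component lists from this post in Python; the dictionary keys and the function name are illustrative labels, not Oracle terminology.

```python
# Component lists taken from the three steps above; the keys are illustrative labels.
HA_STRATEGY = {
    "load balance": [
        "Workspace",
        "Web Analysis",
        "Financial Reporting Web Components",
        "Planning",
        "Analytic Provider Services",
    ],
    "built-in cluster (plus a load balancer)": [
        "Financial Management",
        "Financial Data Quality Management",
    ],
    "built-in load balancing": [
        "Interactive Reporting",
        "Production Reporting",
        "PDF Print Server",
    ],
    "failover only": [
        "Foundation",
        "Financial Reporting Scheduler Server",
        "Essbase Administration Services",
        "Essbase Integration Services",
        "EPMA Dimension Server",
        "Essbase Studio",
    ],
}


def strategy_for(component):
    """Return the H/A approach listed above for a given component name."""
    for strategy, components in HA_STRATEGY.items():
        if component in components:
            return strategy
    raise KeyError(f"no H/A strategy recorded for {component!r}")


print(strategy_for("Planning"))
print(strategy_for("Essbase Studio"))
```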

Notes on Essbase

Sure, if you look at the High Availability support matrix, Oracle is happy to say that Essbase can be clustered and load balanced. But look closely… that is in read-only mode only. While there are a few situations where a read-only Essbase cluster makes sense for an organization, I almost always see this limitation making the Essbase cluster solution useless. It certainly will not work for Planning, which writes data back to Essbase.

To make matters worse, Oracle will not support failover clustering of any kind on Essbase – not even with Oracle Clusterware. While this is supposed to be addressed in the next release in Sept/Nov, it leaves us little choice on the Essbase layer.

Our options are to:

Do it anyway. But just know that if Oracle determines that a support issue is related to the cluster, they have every right to insist that you demonstrate the issue without it in the mix before they will take responsibility.  However, for most behavioral or performance issues that you would call support about, Oracle should not know or even care that your Essbase is running on top of a cluster.

Do an active/cold standby. In essence, have a powered-off machine configured with the same hostname/IP address and connected to the same shared disk resource as the active machine. Upon failure, there is a manual process of shutting down the active server (if needed), dismounting the disk resource, starting up the cold server, mounting the disk resource, and bringing services online.
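
The manual cold standby procedure above is easier to get right under pressure if the order of operations is written down. Here is a minimal sketch in Python, assuming a hypothetical `run_step` callback that an operator or script would supply. The step descriptions come from the paragraph above, and nothing here calls a real Oracle or clustering API.

```python
def cold_standby_failover(active_still_running, run_step):
    """Run the manual failover steps in order and return the steps performed.

    run_step is a hypothetical callback: it could print a checklist item,
    prompt an operator for confirmation, or invoke a site-specific script.
    """
    steps = []
    if active_still_running:
        # Only needed when the failed machine has not already gone down.
        steps.append("shut down the active server")
    steps += [
        "dismount the shared disk resource",
        "start up the cold standby server",
        "mount the shared disk resource on the standby",
        "bring services online",
    ]
    for step in steps:
        run_step(step)
    return steps


# Example: print the procedure as a checklist.
cold_standby_failover(active_still_running=True, run_step=print)
```

Because the standby uses the same hostname/IP address as the active machine, the ordering matters: the standby should not come up while the active server is still on the network.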

Notes on OBIEE

I have to say that OBIEE is pretty flexible and robust for high availability. There are many components to OBIEE, and each can be configured independently for load balancing as needed. The problem is that each MUST be configured independently. The flexibility is great, but it can be a challenge for the novice to set up. And I’m not sure how necessary it all is. A popular approach for OBIEE is to install all components on every server in the cluster and load balance between them. It would be nice to have a more graphical, built-in approach to that… but in the meantime, we have loads of fun editing INI and XML files.

The point is that OBIEE can be fully redundant and fault tolerant. With a certified partner, don’t be afraid to insist on it in your organization. It works.

If you would like to discuss H/A in your environment, please contact me.

~ by Eric Helmer on August 19, 2009.

4 Responses to “High Availability Strategy for Oracle EPM Version 11”

  1. it was very informative, Eric.
    Thanks & Regards,
    Vijay Sai

  2. Awesome! .. Cool stuff!!
    But I’m gonna try this at home on 11.1.2 – not a big deal.

    John,

  3. Hi Eric,
    Great post on EPM; as a strategy guy working with global finance to better understand EPM, your words are very useful – showing great clarity of thought and underlying expertise.

    Thanks,

    Mikeg

  4. Hi Eric,
    thanks for the post, very clear and informative. I’ll be dealing with H/A on a project soon and would like to have your opinion and help if you can. I’m a newbie in the field and this project is my first experience, just trying to find some knowledge sources to help me get a good start! Thanks.

    Zak
