Thursday, February 11, 2010

VMware PEX: Site Recovery Manager "Up and Running"

This VMware Partner Exchange (PEX) session was Site Recovery Manager "Up and Running" (TECHBC0321). I'm writing as I go, so this might be a little messy.

This session will focus on the problems typically encountered during SRM implementations.

Required Components
2x vCenter servers
2x SRM Servers
Replication Product from the Storage Vendor
SRA (Storage Replication Adapter) from the Storage Vendor

Install Workflow
1. vCenter at each site
2. SRM Server (separate server)
3. SRM needs a DB instance
4. SRA (Often the most complex and causes the most problems) - Install on SRM server

SRA's Function
  1. Setup - Query for replicated LUNs, match LUNs to VM inventory
  2. Failover - Automates promotion of LUNS at remote site, and 
  3. Testing - LUN Snapshot creation
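
The setup step (matching replicated LUNs to the VM inventory) can be pictured with a minimal Python sketch. This is not how any vendor's SRA is implemented; the function name, the dictionary shapes, and the LUN, datastore, and VM names are all made up for illustration. It also shows why a replicated datastore with no VMs on it never shows up in the result.

```python
from collections import defaultdict

def match_luns_to_vms(replicated_luns, vm_datastores):
    """Group VMs under the replicated LUN that backs their datastore.

    replicated_luns: dict of LUN id -> datastore name (hypothetical shape)
    vm_datastores:   dict of VM name -> list of datastore names
    """
    datastore_to_lun = {ds: lun for lun, ds in replicated_luns.items()}
    vms_by_lun = defaultdict(list)
    for vm, datastores in vm_datastores.items():
        for ds in datastores:
            lun = datastore_to_lun.get(ds)
            if lun is not None:  # skip datastores that are not replicated
                vms_by_lun[lun].append(vm)
    # Only LUNs that back at least one VM appear in the result
    return {lun: sorted(vms) for lun, vms in vms_by_lun.items()}

# Illustrative inventory: ds-empty is replicated but holds no VMs
replicated = {"lun-01": "ds-prod", "lun-02": "ds-empty"}
inventory = {"web01": ["ds-prod"], "db01": ["ds-prod"], "test01": ["ds-local"]}
result = match_luns_to_vms(replicated, inventory)
print(result)
```

Here lun-02 is replicated but is left out because nothing in the inventory lives on its datastore, and test01 is left out because its datastore is not replicated.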

What can go wrong during SRA install?
  • Not all SRAs are created equal. Each one is different, and vendors put different levels of effort into development. Some require an additional framework (a Java JRE, for example). Always read all release notes and the install guide before attempting the install.
  • Always download a fresh SRA FROM THE VMWARE SITE, NOT THE VENDOR SITE; many vendors change versions on a frequent basis
  • Whatever you do on one site, do it on the other site
  • When configuring SRA at the protected site, it may fail if not all components are installed at the recovery site (not configured, just installed)
  • What if no datastores appear but the SRA seems to be installed OK? This is because the datastore doesn't have any VMs on it
  • Always verify you have all the needed license features on BOTH storage systems to fully support replication in BOTH directions
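
The "whatever you do on one site, do it on the other" rule lends itself to a quick parity check. The sketch below is only an illustration: it assumes you have already gathered a component-to-version inventory for each site by hand, and the component names and version numbers are invented.

```python
def site_differences(protected, recovery):
    """Return components whose installed version differs between sites.

    Each argument is a dict of component name -> version string
    (a hypothetical inventory format). A component that is missing
    from a site is reported with None for that site.
    """
    differences = {}
    for component in sorted(set(protected) | set(recovery)):
        p = protected.get(component)
        r = recovery.get(component)
        if p != r:
            differences[component] = (p, r)
    return differences

# Invented example inventories, not real version numbers
protected_site = {"SRM": "4.0", "SRA": "1.2", "Java JRE": "1.6"}
recovery_site = {"SRM": "4.0", "SRA": "1.1"}
diffs = site_differences(protected_site, recovery_site)
for name, (p, r) in diffs.items():
    print(f"{name}: protected={p} recovery={r}")
```

An empty result means the two sites match; anything listed is worth fixing before configuring the SRA at the protected site.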

Design Considerations
  • Disparate networks (re-ip of servers) - Most Common
  • Stretch vlans (no re-ip of servers) - Less Common
  • DNS services
  • Active Directory services - Could be dedicated for testing and failover or the same production AD
  • Consider applications with hard-coded IPs
  • Remember Default Gateway and Subnet Mask
  • When performing a recovery, the fewer changes the better (DOC-1491 in VMware Communities)
  • SRM supports RDMs but it isn't recommended
  • If using multiple virtual hard disks, make sure all of them are replicated (or exist) at both locations
  • SRM does not support replicating virtual machines with snapshots
  • Need a port 80 HTTPS tunnel between sites for site pairing (it is encrypted but travels on port 80 instead of 443 to make security easier)
  • Limits: 150 protection groups / 1000 protected VMs
  • A protection group can consist of multiple datastores if a virtual machine spans datastores
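
For the disparate-networks case, it helps to plan the re-IP mapping ahead of time, including the subnet mask and default gateway mentioned above. The following is a rough planning sketch using Python's standard ipaddress module, not SRM's own customization mechanism. The function name and addresses are made up, and it assumes both sites use same-sized subnets, that each server keeps its host offset, and that the gateway is the first address in the recovery subnet.

```python
import ipaddress

def plan_reip(address, protected_net, recovery_net):
    """Map an address in the protected subnet to the recovery subnet."""
    src = ipaddress.ip_network(protected_net)
    dst = ipaddress.ip_network(recovery_net)
    ip = ipaddress.ip_address(address)
    if ip not in src:
        raise ValueError(f"{ip} is not in {src}")
    if src.prefixlen != dst.prefixlen:
        raise ValueError("protected and recovery subnets must be the same size")
    # Keep the same host offset inside the new subnet
    offset = int(ip) - int(src.network_address)
    return {
        "ip": str(dst.network_address + offset),
        "netmask": str(dst.netmask),
        # Assumption: the gateway is the first address in the subnet
        "gateway": str(dst.network_address + 1),
    }

plan = plan_reip("10.1.20.15", "10.1.20.0/24", "10.2.20.0/24")
print(plan)
```

Applications with hard-coded IPs still need separate attention, since a mapping like this only covers the network settings of the server itself.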
